Defining a Data Model
Modelling entities, relationships, signals and time series in your data
Exabel allows you to create flexible data models to represent your data, and subsequently query, transform and analyze the data to derive investment insights.
You can create new entities (e.g. brands that belong to a company) and relationships (e.g. ownership of brands by a company), and link these to the existing company entities already provided by Exabel. Then, you can import time series data on any entity in your data model, and use our signal DSL to query and transform this data.
Example - why is this important?
Gap Inc. owns several brands, including Banana Republic. Banana Republic has stores across many countries. You have data at a very granular level, showing Gap's sales performance by brand and also by country.
An analyst may want the flexibility to deep-dive into Gap at this granular level (Banana Republic in the USA), but may equally want to roll the lower-level metrics up to understand Gap's overall performance.
A well-defined data model will allow the analyst a lot of flexibility to:
- Query and analyze data on a granular level
- Systematically aggregate data up from a lower entity-level to a higher one
Best practice: data models
- Data models do not have to be complex! In many cases, simple company-level data models will suffice.
- Aim to ultimately connect entities to the `company` entity type, in order to enable aggregation up to company-level.
- Investment users (analysts / PMs) should define up-front the granularity of analysis desired. This will help your data engineering team design the data model and ensure that the right data is imported.
Before reviewing the example data models below, it is important to understand these key concepts:
Example #1: company-level data
The simplest data model, applicable if your data exists at company-level, or you choose to pre-aggregate as such:
Notes
- A single `sales` signal is defined for company entities, and encapsulates multiple time series - one for each company.
- `company` is a global entity type that is provided by the Exabel platform.
- Company entities are pre-loaded on the Exabel platform, so you don't have to import these.
Steps
- Import time series data for each company, specifying `sales` as the signal. Whether you are using the file uploader or Exabel SDK, you will be able to specify the signal that each time series is tied to.
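As a rough illustration of this step, the sketch below lays out company-level time series as a table in which the value column is named after the signal. The column layout, the `rows_to_csv` helper and the company identifiers are assumptions made for illustration; they are not the required format of the file uploader or the Exabel SDK.

```python
import csv
import io

# Hypothetical company-level observations: (company identifier, date, value).
observations = [
    ("company_a", "2023-01-31", 100.0),
    ("company_a", "2023-02-28", 110.0),
    ("company_b", "2023-01-31", 55.5),
]


def rows_to_csv(rows, signal):
    """Write one row per data point, using the signal name as the value column header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["entity", "date", signal])
    for entity, date, value in rows:
        writer.writerow([entity, date, value])
    return buffer.getvalue()


# Every time series in this file is tied to the `sales` signal.
csv_text = rows_to_csv(observations, signal="sales")
print(csv_text)
```

Each company contributes its own time series, but all of them share the single `sales` signal.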
Example #2: companies with child entities
Another common data model has lower-level child entities that belong to each company. This example has segment child entities - these might be business or geographical segments, depending on your data:
Notes
- You can link each company to multiple segment entities.
- We create segments as child entities of companies, because each segment can only be owned by one company.
- We define the `sales` signal at both segment- and company-level, and import time series at both levels. Alternatively, you could choose to define the signal and import data only at segment-level, and aggregate to company-level dynamically in the Exabel platform.
- The `HAS_SEGMENT` relationship type is set to be an ownership relationship. This denotes that the from-entity (company) "owns" the to-entity (segment). In most cases, you will be creating ownership relationships.
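To make the roll-up from segment-level to company-level concrete, here is a minimal sketch in plain Python. It is not the signal DSL and does not call the Exabel platform; the entity names, the sample values and the `aggregate_to_company` helper are hypothetical. It only shows how ownership relationships determine which segment series are summed into each company series.

```python
from collections import defaultdict

# Hypothetical HAS_SEGMENT relationships: owning company -> owned segments.
has_segment = {
    "company_a": ["company_a_retail", "company_a_online"],
    "company_b": ["company_b_retail"],
}

# Hypothetical segment-level sales time series: segment -> {date: value}.
segment_sales = {
    "company_a_retail": {"2023-01-31": 60.0, "2023-02-28": 65.0},
    "company_a_online": {"2023-01-31": 40.0, "2023-02-28": 45.0},
    "company_b_retail": {"2023-01-31": 55.5},
}


def aggregate_to_company(relationships, series_by_segment):
    """Sum each company's segment series date by date."""
    company_series = {}
    for company, segments in relationships.items():
        totals = defaultdict(float)
        for segment in segments:
            for date, value in series_by_segment.get(segment, {}).items():
                totals[date] += value
        company_series[company] = dict(sorted(totals.items()))
    return company_series


company_sales = aggregate_to_company(has_segment, segment_sales)
print(company_sales)
```

Because each segment has exactly one owner, no segment value is counted towards more than one company.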
Steps
You will need to use the Exabel SDK in order to import entities, relationships, and non-company signals.
- Create the `segment` entity type in your namespace, as this is not a pre-defined global entity type.
  For now, you will need to ask Exabel to create these for you.
- Import all segment entities that exist in your data, with the `segment` entity type.
- Import relationships connecting each segment to a company, using the `HAS_SEGMENT` relationship type.
- Import time series data for each segment and each company, specifying `sales` as the signal.
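Before running an import like this, it can help to check that the data respects the ownership rule, i.e. that every segment is owned by exactly one company. The sketch below is one possible pre-import check written in plain Python; the payload shapes and the `check_segments` helper are assumptions for illustration and are not part of the Exabel SDK.

```python
from collections import Counter

# Hypothetical payloads prepared for import.
segment_entities = ["company_a_retail", "company_a_online", "company_b_retail"]
has_segment_relationships = [
    # (from-entity: company, to-entity: segment)
    ("company_a", "company_a_retail"),
    ("company_a", "company_a_online"),
    ("company_b", "company_b_retail"),
]


def check_segments(segments, relationships):
    """Report segments that break the 'exactly one owning company' rule."""
    owners = Counter(segment for _, segment in relationships)
    return {
        # Declared segments with no owning company.
        "unowned": [s for s in segments if owners[s] == 0],
        # Declared segments with more than one owning company.
        "multiply_owned": [s for s in segments if owners[s] > 1],
        # Relationship targets that were never declared as segment entities.
        "undeclared": sorted(set(owners) - set(segments)),
    }


report = check_segments(segment_entities, has_segment_relationships)
print(report)
```

An empty report suggests the entities and relationships are consistent with each other before any time series are imported.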
Example #3: multiple top-level entities
Your data may have additional dimensions that you want to model, beyond the standard company-level entities.
In this example, we have an alternative data set for company job postings, segmented by both company and occupation (e.g. "sales jobs"):

Notes
- We model `occupation` as a top-level entity type, because the data set defines a limited number of standardized occupations that are seen across all companies.
- We create a `company_and_occupation` associative entity type that is owned by both a `company` and an `occupation`.
- Example entities in this model might be:
  - `occupation`: `sales`, `engineering`
  - `company`: Apple, Microsoft
  - `company_and_occupation`: `apple_sales`, `apple_engineering`, `microsoft_sales`, `microsoft_engineering`
- There are 3 signals defined for the `company_and_occupation` entity: `jobs_created`, `jobs_deleted`, and `jobs_active`.
  - The raw signals can therefore be retrieved for a given `company_and_occupation` entity.
  - You may also create signals that aggregate these up to company-level or occupation-level, by using the signal DSL.
- The `jobs_active_duration` signal is defined for all 3 entity types. This means that you must import time series data for this signal, for entities across all 3 entity types.
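The sketch below illustrates how an associative entity type lets the same raw signal be aggregated along either dimension. It is plain Python rather than the signal DSL, and the sample `jobs_active` values and the `aggregate` helper are hypothetical; only the entity names come from the example above.

```python
from collections import defaultdict

# Associative entities: name -> (owning company, owning occupation).
company_and_occupation = {
    "apple_sales": ("Apple", "sales"),
    "apple_engineering": ("Apple", "engineering"),
    "microsoft_sales": ("Microsoft", "sales"),
    "microsoft_engineering": ("Microsoft", "engineering"),
}

# Hypothetical jobs_active values on a single date, one per associative entity.
jobs_active = {
    "apple_sales": 120,
    "apple_engineering": 300,
    "microsoft_sales": 200,
    "microsoft_engineering": 450,
}


def aggregate(values, owners, owner_index):
    """Sum values over one owner dimension (0 = company, 1 = occupation)."""
    totals = defaultdict(int)
    for entity, value in values.items():
        totals[owners[entity][owner_index]] += value
    return dict(totals)


jobs_active_by_company = aggregate(jobs_active, company_and_occupation, 0)
jobs_active_by_occupation = aggregate(jobs_active, company_and_occupation, 1)
print(jobs_active_by_company)
print(jobs_active_by_occupation)
```

Because each `company_and_occupation` entity has one owner of each type, both aggregations cover every raw value exactly once.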
Steps
You will need to use the Exabel SDK in order to import entities, relationships, and non-company signals.
- Create the `occupation` and `company_and_occupation` entity types in your namespace, as these are not pre-defined global entity types. Define `company_and_occupation` as an associative entity type.
  For now, you will need to ask Exabel to create these for you.
- Import all `occupation` and `company_and_occupation` entities that exist in your data.
- Import relationships connecting each `company_and_occupation` to the appropriate `company` and `occupation`, using the corresponding relationship type.
- Import time series data for the `jobs_created`, `jobs_deleted`, and `jobs_active` signals, for the `company_and_occupation` entities.
- Import time series data for the `jobs_active_duration` signal, for the `company_and_occupation`, `company`, and `occupation` entities.
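As a sketch of the entity and relationship steps, the snippet below derives the associative entities and their two ownership relationships from lists of companies and occupations. The relationship type names and the `build_import_plan` helper are hypothetical (this page does not name the relationship types), and for simplicity it assumes every company/occupation combination appears in your data.

```python
from itertools import product

companies = ["apple", "microsoft"]
occupations = ["sales", "engineering"]

# Hypothetical relationship type names, one per owning entity type.
COMPANY_OWNS = "COMPANY_HAS_COMPANY_AND_OCCUPATION"
OCCUPATION_OWNS = "OCCUPATION_HAS_COMPANY_AND_OCCUPATION"


def build_import_plan(companies, occupations):
    """Return associative entity names and (from, type, to) ownership relationships."""
    entities = []
    relationships = []
    for company, occupation in product(companies, occupations):
        name = f"{company}_{occupation}"
        entities.append(name)
        # Each associative entity is owned by one company and one occupation.
        relationships.append((company, COMPANY_OWNS, name))
        relationships.append((occupation, OCCUPATION_OWNS, name))
    return entities, relationships


entities, relationships = build_import_plan(companies, occupations)
print(entities)
print(len(relationships))
```

Each associative entity yields exactly two relationships, so the number of relationships to import is twice the number of `company_and_occupation` entities.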