added

Improved company calendars and aggregation to fiscal periods

We're excited to announce that we've recently released several major improvements to our signal expression language (DSL) for retrieving company calendars, aggregating alternative data to their fiscal periods, and calculating change including period-to-date change.

Background

One of the most common alternative data workflows is to aggregate high-frequency signals (e.g. daily card transactions) to company fiscal quarters, to use as a predictor or proxy for quarterly KPIs. This is cumbersome to do at scale due to varying fiscal calendars across companies, and general-purpose analytics tools are ill-equipped to handle this.

Exabel solves for this in our signal expression language (DSL) for time series analysis. We provide functions to easily retrieve company calendars by leveraging FactSet fundamental and actuals data, aggregate alternative data to these fiscal periods (quarters/semi-annuals/annuals), and calculate change (e.g. YoY). Best of all, these expressions can be written once and applied across hundreds or thousands of companies, saving time and allowing for consistent analysis.

In the past month, we have released enhancements to make these core functions more robust and easier to use with a dramatically simplified syntax.

Improved company calendars

We have introduced a new company_calendar() function that improves on the prior fiscal_calendar():

  • Support for retrieving semi-annual and annual calendars - previously, quarters were always returned, even if the company reports semi-annually.
  • Smart selection of quarters or semi-annuals depending on the company - this means that you can now use a single expression that works across companies reporting quarterly and semi-annually
  • More robust handling of missing periods (due to inevitable data issues) - automatic imputation of missing periods by default.
  • More robust handling of calendar frequency changes - e.g. when a company switches between quarterly and semi-annual reporting, or between different quarterly calendar frequencies.
  • Works on non-company entities! If an entity (e.g. merchant, web domain) is owned by a single company entity via ownership relationships, we automatically find that company and return its calendar.
# New

company_calendar()         # Fiscal quarters or semi-annuals, depending on company
company_calendar('FQ/FS')  # Fiscal quarters or semi-annuals, depending on company
company_calendar('FQ')     # Fiscal quarters only
company_calendar('FS')     # Fiscal semi-annuals only
company_calendar('FY')     # Fiscal years only

company_calendar('FQ/FS', extend=True)  # Also predict future periods
company_calendar('FQ/FS', extend=2)     # Predict only 2 future periods


# Old

fiscal_calendar()          # Always returns fiscal quarters

Reference: DSL documentation for company_calendar()

Aggregation to fiscal periods

Building on the new company calendar, the resample() function now supports resampling to company fiscal periods (FQ/FS, FQ, FS, FY, etc):

  • By using the FQ/FS smart frequency, you can now create a single expression that works across companies reporting quarterly and semi-annually.
  • Simplified and more concise syntax.
  • Also supports all the standard pandas calendar frequencies (W, M, Q, etc) and aggregation methods.
  • New Exabel-specific mean_times_days method.
  • Advanced parameters: control whether to include the first / last periods with partial data, and the minimum number of points required in each period.
# New

signal.resample('FQ/FS', 'sum')  # Fiscal quarters or semi-annuals, depending on company
signal.resample('FQ', 'sum')     # Fiscal quarters
signal.resample('FS', 'sum')     # Fiscal semi-anuals
signal.resample('FY', 'sum')     # Fiscal years

signal.resample('FQ/FS', 'mean')             # Mean
signal.resample('FQ/FS', 'median')           # Median
signal.resample('FQ/FS', 'mean_times_days')  # Mean-times-days


# Old - verbose

signal.aggregate_over(fiscal_calendar(), 'sum')  # Only aggregates to fiscal quarters
signal.aggregate_over(actual('sales', alignment='afp', period='s'), 'sum')  # Fiscal semi-annuals
signal.aggregate_over(actual('sales', alignment='afp', period='a'), 'sum')  # Fiscal years

📘

What is the mean_times_days method for?

Often, with alternative data signals that represent "flow", such as daily credit card spend or web/app traffic, analysts want to sum these over a quarter, rather than take the mean / median. A sum will better represent reality when companies have quarters that vary in length - if a quarter is longer, its revenue will be larger, all else being equal.

Yet, a sum is vulnerable to missing data points, which is a common situation when working with alternative data. The mean_times_days method takes the mean of all data points over each period, and multiplies this by the number of days in each period. This is effectively the same as taking a sum over each period, where the missing values have been imputed with the mean value.

Reference: DSL documentation for resample()

Aggregate change and period-to-date change

Finally, we've added new agg_change() and agg_relative_change() functions that combine aggregation and absolute / relative change calculations:

  • More convenient & concise syntax for one of the most common transformations
  • Support for all the company fiscal period frequencies mentioned above, including the FQ/FS smart frequency.
  • Native period-to-date support to calculate QTD / HTD / YTD change for the latest period even if there is only partial data.
  • Advanced parameters: control whether to include the first / last periods with partial data, and the minimum number of points required in each period.
# New

signal.agg_change('FQ/FS', 'sum', years=1)           # Sum over quarters/semi-annuals + absolute YoY
signal.agg_relative_change('FQ/FS', 'sum', years=1)  # Sum over quarters/semi-annuals + relative YoY

# Include period-to-date (QTD/HTD) YoY, if the signal has data for the first 14 days in the period
signal.agg_change('FQ/FS', 'sum', years=1, last_period_days=14)

# Include period-to-date (QTD/HTD) YoY, if the signal has data for the first 50% of days in the period
signal.agg_change('FQ/FS', 'sum', years=1, last_period_fraction=0.5)

# Include year-to-date (YTD) YoY, if the signal has data for the first 10% of days in the year
signal.agg_change('FY', 'sum', years=1, last_period_fraction=0.1)


# Old - verbose, no period-to-date support

signal.aggregate_over(fiscal_calendar(), 'sum').change(years=1)
signal.aggregate_over(fiscal_calendar(), 'sum').relative_change(years=1)

📘

Period-to-date calculation details

In the example above, where period-to-date change is calculated only if there is data for the first 14 days in the period - this does not mean that your alternative data signal must have data points on all 14 days.

Instead, we look at your alternative data time series for the last available data point, and count the number of days between the last period's start date and this last data point. If this is longer than the last_period_days or last_period_fraction, then we include the last period and apply period-to-date calculation.

Example: if the last period is 1 Jan 2023 - 31 Mar 2023, and there are weekly data points on 1 Jan, 8 Jan, and 15 Jan. There are 15 days between the period start and the last data point (15 Jan). Hence, this period is eligible for period-to-date calculation.

The period-to-date calculation looks at the percent of days for which there is data in the current period. It then finds the comparable period (e.g. a year ago for YoY change), and determines the comparable period-to-date range. Finally, it aggregates your alternative data time series across the current and comparable period-to-date ranges, and calculates the change.

Example: following the same example, there is data covering 15 days or 16.7% (15/90 days) of the current period of 1 Jan 2023 - 31 Mar 2023. If calculating year-on-year change, the comparable period-to-date range is the first 16.7% of days in the year-ago period of 1 Jan 2022 - 31 Mar 2022.

Reference: DSL documentation for agg_change() and agg_relative_change()