Exporting via Exabel SDK

Export data in production pipelines

The Exabel SDK provides an easy way to programmatically export data in bulk, and is recommended for use in production data pipelines.

Currently, a Python command line script is provided, which takes in a SQL query and produces data files in a variety of formats.

Please let us know at [email protected] if you would like to see other methods or formats for export!

Installation

pip install exabel_data_sdk

Or download from PyPI. The SDK requires Python 3.7 or later.

Command line use

You can then execute an export request by running for example:

python -m exabel_data_sdk.scripts.export_data --filename data.csv --format csv \
    --query "SELECT Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'"

The first time you run the script a web browser window is opened where you are asked to provide user name and password. The script stores an access token in ~/.exabel, so you will not be asked again until the token expires.

The following are required arguments:

  • --query: your export query - see below for examples
  • --filename: filename to be written
  • --format: export file format - options are csv, excel (.xlsx format), pickle (pickled pandas DataFrame), json, feather, parquet

Python / Jupyter use

You can also execute the same export request from Python, for example within a Jupyter notebook:

from exabel_data_sdk.client.api.export_api import ExportApi
from exabel_data_sdk.query.column import Column
from exabel_data_sdk.query.dashboard import Dashboard
from exabel_data_sdk.query.signals import Signals

export_api = ExportApi()
series = export_api.signal_query("Random_Signal", "AAPL US")
print(series)

# Alternatively, a raw SQL statement can be executed directly:
frame = export_api.run_query("SELECT time, Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'")
print(frame)

The authentication is the same as when running the shell script, as described above, using a web browser window for authentication the first time.

Here is an example of a Jupyter notebook illustrating the usage.

Export signals

The basic query for exporting a signal is as follows:

SELECT Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'

The examples on this page use the signal Random_Signal which simply produces a random time series, because this is a signal that is available to everyone to export. Substitute this signal with a more interesting one for which you have export permission.

Export metadata

The above query returns a file with just a single time series and no metadata, which may not be so useful. At the very least, you may want to know the timestamp of each data point, which you can get by including the time column.

SELECT time, Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
export_api.signal_query("Random_Signal", "AAPL US")

This query produces a file with two columns of data: one containing a series of timestamps and one containing the value of the signal Random_Signal at each corresponding timestamp.

The table below shows all the metadata columns you can request. Most of them are metadata about the company / entity for which you request data, and are therefore most useful when you request data for multiple entities in a single query.

MetadataDescription
timeThe timestamp of each data point
versionThe point-in-time date (or "version") each data point corresponds to. This is not the known-time of the data point. Only useful when evaluating signals at a point-in-time.
labelFor signals that return multiple time series, this will contain the label for each time series
nameThe entity name, e.g. Apple, Inc.
bloomberg_tickerBloomberg ticker of the entity, e.g. "AAPL US"
factset_idFactSet Identifer for the entity
isinThe ISIN (International Securities Identification Number) of the security - only available if you have an ISIN license, please contact us at [email protected] to enable this
micMIC (Market Identifier Code) of the exchange where a security is traded
tickerTicker of the security
exabel_idExabel ID for the entity
resource_nameExabel resource name of the entity, of the format entityTypes/company/entities/xyz

Exporting data for multiple companies

If you want to export data for multiple companies, this can be specified in either of the following ways:

SELECT name, time, Random_Signal FROM signals
WHERE bloomberg_ticker='AAPL US' OR bloomberg_ticker='MSFT US'

SELECT name, time, Random_Signal FROM signals WHERE bloomberg_ticker IN ('AAPL US', 'MSFT US')
export_api.signal_query("Random_Signal", ["AAPL US", "MSFT US"])

In this case you should include name or another identifier in order to be able to distinguish the data corresponding to the different companies.

You can also specify the entities with a tag, as follows:

SELECT name, time, Random_Signal FROM signals WHERE has_tag('graph:tag:user:<...>')
export_api.signal_query("Random_Signal", tag="graph:tag:user:<...>")

Specifying a time interval

By default the exported data covers a time interval starting 1 January 2000 and ending three years into the future. If you need another time interval you can specify constraints on the time variable as follows:

SELECT time, Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
AND time >= '1980-01-01' AND time < '1990-01-01'
export_api.signal_query("Random_Signal", "AAPL US", start_time="1980-01-01", end_time="1990-01-01")

Specifying a point-in-time

By default, the signals are evaluated with the latest data. You may choose to evaluate and export signals at a given point-in-time date by specifying version:

SELECT time, Your_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
AND version = '2020-01-01'
export_api.signal_query("Your_Signal", "AAPL US", version='2020-01-01')

This evaluates the "version" of the signal at that specified date. During signal evaluation, only data points with a known-time ≤ version will be used.

You may also evaluate the signals at multiple point-in-time dates (versions):

SELECT version, time, Your_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
AND version IN ('2020-01-01', '2020-02-01')
export_api.signal_query("Your_Signal", "AAPL US", version=['2020-01-01', '2020-02-01'])

When using the above query, the returned data file will be in long-form, with a set of rows for each version. You should include version in the SELECT clause so as to identify the version for each data point.

Export dashboards

The syntax for exporting dashboards is:

SELECT * FROM dashboard WHERE dashboard_id = 'dashboard:dashboard:1'
query = Dashboard.query(1)
export_api.run_query(query)

Exporting specific columns

It is also possible to only export certain columns in a dashboard by specifying their labels:

SELECT 'Actual Sales', 'Estimated Sales' FROM dashboard WHERE dashboard_id = 'dashboard:dashboard:1'
query = Dashboard.query(1, columns=["Actual Sales", "Estimated Sales"])
export_api.run_query(query)

Dashboards with multiple widgets

For dashboards with multiple widgets, only the first is exported by default. If you want to export another widget, you can specify the widget_id as:

SELECT * FROM dashboard WHERE dashboard_id = 'dashboard:dashboard:1'
AND widget_id = 'dashboard:widget:2'
query = Dashboard.query(1, widget=2)
export_api.run_query(query)

The first widget has ID dashboard:widget:1, and they are numbered sequentially.

Numeric IDs are also supported, so you can also write:

SELECT * FROM dashboard WHERE dashboard_id = 1 AND widget_id = 2
query = Dashboard.query(1, widget=2)
export_api.run_query(query)

Column filtering

It is possible to filter a dashboard by one or more columns. A column is identified by its display name. Alternatively, if the dashboard has multiple columns with the same display name, a column can be identified by its column index using column_index=index.

It is only possible to filter columns with numeric or date values.

Supported operators are:

  • <: Less than.
  • >: Larger than.

In addition, there are two functions to filter columns containing dates relative to the current date:

  • days_in_future(<column_name>, int) - The date must at most fall these many days in the future.
  • days_in_past(<column_name>, int) - The date must at most fall these many days in the past.
SELECT * FROM dashboard WHERE dashboard_id = 1 AND 'column 1' > 23 AND 'column_index=3' < 0.3 AND days_in_past('column 4', 7)

Tag filtering

A tag filter can be provided to filter the dashboard to the entities contained in the tag filters.

SELECT * FROM dashboard WHERE dashboard_id = 1 AND (has_tag('graph:tag:1') OR has_tag('graph:tag:2'))

Ordering

A dashboard is by default ordered by the entity display name. Specify a column display name to order by another column (Use column_index=index if there are multiple columns with the same display name).

SELECT * FROM dashboard WHERE dashboard_id = 1 ORDER BY 'column 3' DESC

Export prediction model results

Prediction model results are available as derived signals, using the model_by_id() and backtests() DSL functions.

📘

Predictions vs backtests - what's the difference?

Model predictions are made using the final trained version of the model, incorporating all the historical training data over time.

Model backtests are generated during the backtesting process while the model is being trained.

You will need the ID of your prediction model, which you can find from your browser URL when opening the model. You may also specify a particular model run as the 2nd parameter in the DSL function. If you have activated a particular model run for live predictions, you may skip the model run parameter - the activated run will be used by default.

For example, if you have opened a prediction model run in the Exabel app, your browser URL may look like this, telling you that the model ID is 2750 and the model run is 1:

<https://app.exabel.com/models/signal:model:2729/runs/1/overview>

Exporting model predictions

To retrieve model predictions from the final trained model:

-- Predictions from model 1234, with activated model run
SELECT time, 'model_by_id(1234)' FROM signals WHERE bloomberg_ticker='AAPL US'

-- Predictions from model 1234, run 1
SELECT time, 'model_by_id(1234, 1)' FROM signals WHERE bloomberg_ticker='AAPL US'
# Predictions from model 1234, with activated model run
export_api.signal_query(Column("Prediction", "model_by_id(1234)"), "AAPL US")

# Predictions from model 1234, run 1
export_api.signal_query(Column("Prediction", "model_by_id(1234, 1)"), "AAPL US")

🚧

In-line expressions must be enclosed in quotes

The model_by_id() expression above is enclosed in quotes, as it is an in-line DSL expression.

If you had created a derived signal with the model_by_id() expression, you could instead use the name of that derived signal without enclosing in quotes.

As always, you may export predictions for multiple companies by specifying a tag - you will probably want to specify the tag defining the model universe:

SELECT time, 'model_by_id(1234)' FROM signals WHERE has_tag('graph:tag:user:<...>')
export_api.signal_query(Column("Prediction", "model_by_id(1234)"), tag="graph:tag:user:<...>")

Exporting model backtests

To retrieve model backtests:

-- Backtests from model 1234, with activated model run
SELECT time, 'backtests(1234)' FROM signals WHERE bloomberg_ticker='AAPL US'

-- Backtests from model 1234, run 1
SELECT time, 'backtests(1234, 1)' FROM signals WHERE bloomberg_ticker='AAPL US'
# Backtests from model 1234, with activated model run
export_api.signal_query(Column("Backtests", "backtests(1234)"), "AAPL US")

# Backtests from model 1234, run 1
export_api.signal_query(Column("Backtests", "backtests(1234, 1)"), "AAPL US")

You may also export both predictions and backtests in one single export:

SELECT time, 'backtests(1234)', 'model_by_id(1234)' FROM signals WHERE bloomberg_ticker='AAPL US'
columns = [Column("Prediction", "model_by_id(1234)"), Column("Backtests", "backtests(1234)")]
export_api.signal_query(columns, "AAPL US")

Export alpha test & portfolio strategy allocations

Alpha test and portfolio strategy allocations are available as derived signals, using the allocations() DSL function.

👍

Live portfolio strategies

If live tracking has been enabled on a portfolio strategy run, the allocations export will include ongoing live allocations.

You will need to specify the alpha analysis ID as the analysis parameter to the function.

For example, if you have opened a portfolio strategy backtest in the Exabel app, your browser URL may look like this, telling you that the alpha analysis ID is 17131:

<https://app.exabel.com/portfolio-strategies/signal:portfolioStrategy:1520/runs/signal:alphaAnalysis:17131>

If you have opened an alpha test result in the Exabel app, your browser URL may look like this, telling you that the alpha analysis ID is 21239:

<https://app.exabel.com/signals/alpha/results/signal:alphaAnalysis:21239>

To retrieve the full history of portfolio allocations:

SELECT name, bloomberg_ticker, time, 'allocations(analysis=1234)' FROM signals
export_api.signal_query(Column("Allocation", "allocations(analysis=1234)"), identifier=[Signals.NAME, Signals.BLOOMBERG_TICKER])

To retrieve portfolio allocations for a specific stock:

SELECT name, bloomberg_ticker, time, 'allocations(analysis=1234)' FROM signals WHERE bloomberg_ticker='AAPL US'
export_api.signal_query(Column("Prediction", "allocations(analysis=1234)"), "AAPL US")

To retrieve portfolio allocations in a specific time range:

SELECT name, bloomberg_ticker, time, 'allocations(analysis=1234)' FROM signals WHERE time >= '2020-01-01' AND time < '2021-12-31'
export_api.signal_query(Column("Allocation", "allocations(analysis=1234)"), identifier=[Signals.NAME, Signals.BLOOMBERG_TICKER], start_time="2020-01-01", end_time="2021-12-31")