Exporting via Exabel SDK
Export data in production pipelines
The Exabel SDK provides an easy way to programmatically export data in bulk, and is recommended for use in production data pipelines.
Currently, a Python command line script is provided, which takes in a SQL query and produces data files in a variety of formats.
Please let us know at [email protected] if you would like to see other methods or formats for export!
Installation
pip install exabel_data_sdk
Or download from PyPI. The SDK requires Python 3.7 or later.
Command line use
You can then execute an export request by running for example:
python -m exabel_data_sdk.scripts.export_data --filename data.csv --format csv \
--query "SELECT Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'"
The first time you run the script a web browser window is opened where you are asked to provide user name and password. The script stores an access token in ~/.exabel
, so you will not be asked again until the token expires.
The following are required arguments:
--query
: your export query - see below for examples--filename
: filename to be written--format
: export file format - options arecsv
,excel
(.xlsx
format),pickle
(pickled pandas DataFrame),json
,feather
,parquet
Authenticating using an API-key
It is also possible to authenticate using an API-key provided by Exabel, by using the --api-key
argument:
python -m exabel_data_sdk.scripts.export_data --filename data.csv --format csv \
--query "SELECT Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'" \
--api-key "my_api_key"
Python / Jupyter use
You can also execute the same export request from Python, for example within a Jupyter notebook:
from exabel_data_sdk.client.api.export_api import ExportApi
from exabel_data_sdk.query.column import Column
from exabel_data_sdk.query.dashboard import Dashboard
from exabel_data_sdk.query.signals import Signals
export_api = ExportApi()
series = export_api.signal_query("Random_Signal", "AAPL US")
print(series)
# Alternatively, a raw SQL statement can be executed directly:
frame = export_api.run_query("SELECT time, Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'")
print(frame)
The authentication is the same as when running the shell script, as described above, using a web browser window for authentication the first time.
Here is an example of a Jupyter notebook illustrating the usage.
Authenticating using an API-key
It is also possible to authenticate using an API-key provided by Exabel, by using the ExportApi.from_api_key
factory method:
export_api = ExportApi.from_api_key("my_api_key")
series = export_api.signal_query("Random_Signal", "AAPL US")
print(series)
Export signals
The basic query for exporting a signal is as follows:
SELECT Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
The examples on this page use the signal Random_Signal
which simply produces a random time series, because this is a signal that is available to everyone to export. Substitute this signal with a more interesting one for which you have export permission.
Export metadata
The above query returns a file with just a single time series and no metadata, which may not be so useful. At the very least, you may want to know the timestamp of each data point, which you can get by including the time
column.
SELECT time, Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
export_api.signal_query("Random_Signal", "AAPL US")
This query produces a file with two columns of data: one containing a series of timestamps and one containing the value of the signal Random_Signal
at each corresponding timestamp.
The table below shows all the metadata columns you can request. Most of them are metadata about the company / entity for which you request data, and are therefore most useful when you request data for multiple entities in a single query.
Metadata | Description |
---|---|
time | The timestamp of each data point |
version | The point-in-time date (or "version") each data point corresponds to. This is not the known-time of the data point. Only useful when evaluating signals at a point-in-time. |
label | For signals that return multiple time series, this will contain the label for each time series |
name | The entity name, e.g. Apple, Inc. |
bloomberg_ticker | Bloomberg ticker of the entity, e.g. "AAPL US" |
factset_id | FactSet Identifer for the entity |
mic | MIC (Market Identifier Code) of the exchange where a security is traded |
ticker | Ticker of the security |
exabel_id | Exabel ID for the entity |
resource_name | Exabel resource name of the entity, of the format entityTypes/company/entities/xyz |
Exporting data for multiple companies
If you want to export data for multiple companies, this can be specified in either of the following ways:
SELECT name, time, Random_Signal FROM signals
WHERE bloomberg_ticker='AAPL US' OR bloomberg_ticker='MSFT US'
SELECT name, time, Random_Signal FROM signals WHERE bloomberg_ticker IN ('AAPL US', 'MSFT US')
export_api.signal_query("Random_Signal", ["AAPL US", "MSFT US"])
In this case you should include name or another identifier in order to be able to distinguish the data corresponding to the different companies.
You can also specify the entities with a tag, as follows:
SELECT name, time, Random_Signal FROM signals WHERE has_tag('graph:tag:user:<...>')
export_api.signal_query("Random_Signal", tag="graph:tag:user:<...>")
Specifying a time interval
By default the exported data covers a time interval starting 1 January 2000 and ending three years into the future. If you need another time interval you can specify constraints on the time variable as follows:
SELECT time, Random_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
AND time >= '1980-01-01' AND time < '1990-01-01'
export_api.signal_query("Random_Signal", "AAPL US", start_time="1980-01-01", end_time="1990-01-01")
Specifying a point-in-time
By default, the signals are evaluated with the latest data. You may choose to evaluate and export signals at a given point-in-time date by specifying version
:
SELECT time, Your_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
AND version = '2020-01-01'
export_api.signal_query("Your_Signal", "AAPL US", version='2020-01-01')
This evaluates the "version" of the signal at that specified date. During signal evaluation, only data points with a known-time ≤ version will be used.
You may also evaluate the signals at multiple point-in-time dates (versions):
SELECT version, time, Your_Signal FROM signals WHERE bloomberg_ticker='AAPL US'
AND version IN ('2020-01-01', '2020-02-01')
export_api.signal_query("Your_Signal", "AAPL US", version=['2020-01-01', '2020-02-01'])
When using the above query, the returned data file will be in long-form, with a set of rows for each version
. You should include version
in the SELECT clause so as to identify the version for each data point.
Export dashboards
Only tables in dashboards can be exported as widgets. To export chart widgets, you have to export the signals of the chart (see above).
The syntax for exporting dashboards is:
SELECT * FROM dashboard WHERE dashboard_id = 'dashboard:dashboard:1'
query = Dashboard.query(1)
export_api.run_query(query)
Exporting specific columns
It is also possible to only export certain columns in a dashboard by specifying their labels:
SELECT 'Actual Sales', 'Estimated Sales' FROM dashboard WHERE dashboard_id = 'dashboard:dashboard:1'
query = Dashboard.query(1, columns=["Actual Sales", "Estimated Sales"])
export_api.run_query(query)
Dashboards with multiple widgets
For dashboards with multiple widgets, only the first is exported by default. If you want to export another widget, you can specify the widget_id
as:
SELECT * FROM dashboard WHERE dashboard_id = 'dashboard:dashboard:1'
AND widget_id = 'dashboard:widget:2'
query = Dashboard.query(1, widget=2)
export_api.run_query(query)
The first widget has ID dashboard:widget:1
, and they are numbered sequentially.
Numeric IDs are also supported, so you can also write:
SELECT * FROM dashboard WHERE dashboard_id = 1 AND widget_id = 2
query = Dashboard.query(1, widget=2)
export_api.run_query(query)
Column filtering
It is possible to filter a dashboard by one or more columns. A column is identified by its display name. Alternatively, if the dashboard has multiple columns with the same display name, a column can be identified by its column index using column_index=index
.
It is only possible to filter columns with numeric or date values.
Supported operators are:
<
: Less than.>
: Larger than.
In addition, there are two functions to filter columns containing dates relative to the current date:
days_in_future(<column_name>, int)
- The date must at most fall these many days in the future.days_in_past(<column_name>, int)
- The date must at most fall these many days in the past.
SELECT * FROM dashboard WHERE dashboard_id = 1 AND 'column 1' > 23 AND 'column_index=3' < 0.3 AND days_in_past('column 4', 7)
Row filtering
Rows can be filtered by tag id or Exabel entity ID, or a combination of these. Exabel IDs must be listed in an IN
clause:
SELECT * FROM dashboard WHERE dashboard_id = 1
AND (has_tag('graph:tag:1') OR has_tag('graph:tag:2')
OR exabel_id IN ('graph:entity::company::F_000C7F-E','graph:entity::company::F_0FPWZZ-E'))
Ordering
A dashboard is by default ordered by the entity display name. Specify a column display name to order by another column (Use column_index=index
if there are multiple columns with the same display name).
SELECT * FROM dashboard WHERE dashboard_id = 1 ORDER BY 'column 3' DESC
Export financial models
The basic syntax for exporting financial models is:
SELECT * FROM financial_models WHERE financial_model_id = 123 AND bloomberg_ticker='AAPL US'
The company identifier can be one of the following: bloomberg_ticker
, factset_id
, isin
, mic
, ticker
, exabel_id
or resource_name
. See description above for explanation of each of these.
Exporting data for multiple companies
Exporting for multiple companies is supported if the output format is excel
. Data for each company will be written to a separate excel sheet.
Specify either a list of company identifiers or a tag:
SELECT * FROM financial_models WHERE financial_model_id = 123 AND bloomberg_ticker IN ('AAPL US', 'MSFT US')
SELECT * FROM financial_models WHERE financial_model_id = 123 AND has_tag('graph:tag:1')
Specifying a time interval
By default, the exported data covers a time interval as configured in the financial model. If you need another time interval you can specify constraints on the time variable as follows:
SELECT * FROM financial_models WHERE financial_model_id = 123 AND bloomberg_ticker='AAPL US' AND time >= '2015-01-01' AND time < '2023-01-01'
Export prediction model results
Prediction model results are available as derived signals, using the model_predictions()
and model_backtests()
DSL functions.
Predictions vs backtests - what's the difference?
Model predictions are calculated using the current values for all input signals,
whereas model backtests were produced using the values of the input signals as they were at the time the backtests were run.The model predictions are effectively in-sample predictions for the part of the data that was part of the training, and out-of-sample predictions for new data (that wasn't seen during the training).
The backtests are done using either walk-forward backtesting or 5-fold cross validation, depending on the model configuration. See Prediction Models - Setup for more details on how this works. What is returned by the
model_backtests()
function are the out-of-sample predictions generated during that backtesting process.
You will need the ID of your prediction model, which you can find from your browser URL when opening the model. You may also specify a particular model run as the 2nd parameter in the DSL function. If you have activated a particular model run for live predictions, you may skip the model run parameter - the activated run will be used by default.
For example, if you have opened a prediction model run in the Exabel app, your browser URL may look like this, telling you that the model ID is 2750
and the model run is 1
:
<https://app.exabel.com/models/signal:model:2729/runs/1/overview
>
Exporting model predictions
To retrieve model predictions from the final trained model:
-- Predictions from model 1234, with activated model run
SELECT time, 'model_predictions(1234)' FROM signals WHERE bloomberg_ticker='AAPL US'
-- Predictions from model 1234, run 1
SELECT time, 'model_predictions(1234, 1)' FROM signals WHERE bloomberg_ticker='AAPL US'
# Predictions from model 1234, with activated model run
export_api.signal_query(Column("Prediction", "model_predictions(1234)"), "AAPL US")
# Predictions from model 1234, run 1
export_api.signal_query(Column("Prediction", "model_predictions(1234, 1)"), "AAPL US")
In-line expressions must be enclosed in quotes
The
model_predictions()
expression above is enclosed in quotes, as it is an in-line DSL expression.If you had saved a derived signal with the
model_predictions()
expression to your Library, you could instead use the name of that derived signal without enclosing in quotes.
As always, you may export predictions for multiple companies by specifying a tag - you will probably want to specify the tag defining the model universe:
SELECT time, 'model_predictions(1234)' FROM signals WHERE has_tag('graph:tag:user:<...>')
export_api.signal_query(Column("Prediction", "model_predictions(1234)"), tag="graph:tag:user:<...>")
Exporting model backtests
To retrieve model backtests:
-- Backtests from model 1234, with activated model run
SELECT time, 'model_backtests(1234)' FROM signals WHERE bloomberg_ticker='AAPL US'
-- Backtests from model 1234, run 1
SELECT time, 'model_backtests(1234, 1)' FROM signals WHERE bloomberg_ticker='AAPL US'
# Backtests from model 1234, with activated model run
export_api.signal_query(Column("Backtests", "backtests(1234)"), "AAPL US")
# Backtests from model 1234, run 1
export_api.signal_query(Column("Backtests", "backtests(1234, 1)"), "AAPL US")
You may also export both predictions and backtests in one single export:
SELECT time, 'model_backtests(1234)', 'model_predictions(1234)' FROM signals WHERE bloomberg_ticker='AAPL US'
columns = [Column("Prediction", "model_predictions(1234)"), Column("Backtests", "backtests(1234)")]
export_api.signal_query(columns, "AAPL US")
Export alpha test & portfolio strategy allocations
Alpha test and portfolio strategy allocations are available as derived signals, using the allocations()
DSL function.
Live portfolio strategies
If live tracking has been enabled on a portfolio strategy run, the allocations export will include ongoing live allocations.
You will need to specify the alpha analysis ID as the analysis
parameter to the function.
For example, if you have opened a portfolio strategy backtest in the Exabel app, your browser URL may look like this, telling you that the alpha analysis ID is 17131
:
<https://app.exabel.com/portfolio-strategies/signal:portfolioStrategy:1520/runs/signal:alphaAnalysis:17131
>
If you have opened an alpha test result in the Exabel app, your browser URL may look like this, telling you that the alpha analysis ID is 21239
:
<https://app.exabel.com/signals/alpha/results/signal:alphaAnalysis:21239
>
To retrieve the full history of portfolio allocations:
SELECT name, bloomberg_ticker, time, 'allocations(analysis=1234)' FROM signals
export_api.signal_query(Column("Allocation", "allocations(analysis=1234)"), identifier=[Signals.NAME, Signals.BLOOMBERG_TICKER])
To retrieve portfolio allocations for a specific stock:
SELECT name, bloomberg_ticker, time, 'allocations(analysis=1234)' FROM signals WHERE bloomberg_ticker='AAPL US'
export_api.signal_query(Column("Prediction", "allocations(analysis=1234)"), "AAPL US")
To retrieve portfolio allocations in a specific time range:
SELECT name, bloomberg_ticker, time, 'allocations(analysis=1234)' FROM signals WHERE time >= '2020-01-01' AND time < '2021-12-31'
export_api.signal_query(Column("Allocation", "allocations(analysis=1234)"), identifier=[Signals.NAME, Signals.BLOOMBERG_TICKER], start_time="2020-01-01", end_time="2021-12-31")
Updated 4 months ago