Integration with Dagster
Dagster is a popular open-source data pipeline orchestrator. Dagster Cloud is a fully managed service for Dagster. This guide demonstrates how to set up Cube and Dagster to work together so that Dagster can push changes from upstream data sources to Cube via the Orchestration API.

Resources
In Dagster, each workflow is represented by a job, a Python function decorated
with a @job decorator. Jobs include calls to ops, Python functions decorated
with an @op decorator. Ops represent distinct pieces of work executed within a
job. They can perform various tasks: poll for some precondition, perform
extract-transform-load (ETL), or trigger external systems like Cube.
Integration between Cube and Dagster is enabled by the
dagster_cube package.
The Cube and Dagster integration package was originally contributed by
Olivier Dupuis, founder of
discursus.io, for which we're very grateful.
The package provides the CubeResource class:
- For querying Cube via the /v1/load endpoint of the REST API.
- For triggering pre-aggregation builds via the /v1/pre-aggregations/jobs endpoint of the Orchestration API.
Installation
Install Dagster. Create a new directory:

Configuration
Create a new file named cube.py with the following contents:
The make_request method for the load endpoint accepts a Cube
query via the query option, and the make_request method for the
pre-aggregations/jobs endpoint accepts a pre-aggregation selector via the
selector option.
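To make the difference concrete, the two payload shapes can be sketched side by side. The specific measures, dimensions, and selector fields below are hypothetical examples, not values from this guide:

```python
# Payload for the /v1/load endpoint: a Cube query under the "query" option.
# The measure and dimension names are hypothetical.
load_payload = {
    "query": {
        "measures": ["orders.count"],
        "dimensions": ["orders.status"],
    }
}

# Payload for the /v1/pre-aggregations/jobs endpoint: a pre-aggregation
# selector under the "selector" option. The selector fields shown are
# illustrative assumptions.
jobs_payload = {
    "selector": {
        "contexts": [{"securityContext": {}}],
        "timezones": ["UTC"],
    }
}
```

Either payload would be passed to make_request for the corresponding endpoint.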