Integration with Prefect
Prefect is a popular open-source orchestrator for data-intensive workflows. Prefect Cloud is a fully managed service for Prefect. This guide demonstrates how to set up Cube and Prefect to work together so that Prefect can push changes from upstream data sources to Cube via the Orchestration API.
Tasks
In Prefect, each workflow is represented by a flow, a Python function decorated with the @flow decorator. Flows include calls to tasks, Python functions
decorated with a @task decorator, as well as to child flows. Tasks represent
distinct pieces of work executed within a flow. They can perform various jobs:
poll for some precondition, perform extract-load-transform (ETL), or trigger
external systems like Cube.
Integration between Cube and Prefect is enabled by the
prefect-cubejs package.
The Cube and Prefect integration package was originally contributed by
Alessandro Lollo, Data Engineering Manager at Cloud Academy (case study),
for which we’re very grateful.
The package provides the following tasks:
run_query, for querying Cube via the /v1/load endpoint of the REST API.
build_pre_aggregations, for triggering pre-aggregation builds via the /v1/pre-aggregations/jobs endpoint of the Orchestration API.
Installation
Install Prefect, then create a new directory for the project.
Configuration
Create a new workflow named cube_query.py. The workflow calls the run_query
task, which accepts a Cube query via the query option.
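To make the shape of the query option concrete, here is a minimal sketch of a Cube query and of the raw /v1/load request it corresponds to. It uses only the Python standard library rather than the prefect-cubejs task, and it builds the request without sending it. The deployment URL, the token placeholder, the Orders.count and Orders.status member names, and the build_load_request helper are all hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: replace with your deployment URL and a JWT
# signed with your Cube API secret.
CUBE_API_URL = "https://cube.example.com/cubejs-api"
CUBE_API_TOKEN = "<JWT signed with your API secret>"

# Hypothetical query: member names depend on your own data model.
query = {
    "measures": ["Orders.count"],
    "dimensions": ["Orders.status"],
}


def build_load_request(base_url: str, token: str, query: dict) -> Request:
    """Build (but do not send) a GET request to the /v1/load endpoint."""
    # The query travels as a URL-encoded JSON string in the `query` parameter.
    params = urlencode({"query": json.dumps(query)})
    return Request(
        f"{base_url}/v1/load?{params}",
        headers={"Authorization": token},
        method="GET",
    )


request = build_load_request(CUBE_API_URL, CUBE_API_TOKEN, query)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. Inside a flow, the run_query task is what talks to this endpoint, so a sketch like this is mainly useful for checking a query by hand before wiring it into a workflow.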
Create a new workflow named cube_build.py. The workflow calls the
build_pre_aggregations task, which accepts a pre-aggregation selector
via the selector option.
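As an illustration of what a selector looks like, the sketch below composes one and wraps it in the JSON body expected by the /v1/pre-aggregations/jobs endpoint. It relies on the Python standard library only, not on the prefect-cubejs task, and it does not send anything. The deployment URL, the token placeholder, the selector values, and the build_jobs_request helper are assumptions made for the example.

```python
import json
from urllib.request import Request

# Hypothetical values: replace with your deployment URL and a JWT
# signed with your Cube API secret.
CUBE_API_URL = "https://cube.example.com/cubejs-api"
CUBE_API_TOKEN = "<JWT signed with your API secret>"

# Example selector: build pre-aggregations for an empty security
# context in the UTC timezone.
selector = {
    "contexts": [{"securityContext": {}}],
    "timezones": ["UTC"],
}


def build_jobs_request(base_url: str, token: str, selector: dict) -> Request:
    """Build (but do not send) a POST request that would start build jobs."""
    # The "post" action asks the Orchestration API to create new jobs.
    body = json.dumps({"action": "post", "selector": selector}).encode("utf-8")
    return Request(
        f"{base_url}/v1/pre-aggregations/jobs",
        data=body,
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_jobs_request(CUBE_API_URL, CUBE_API_TOKEN, selector)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Narrowing the selector keeps builds targeted: only pre-aggregations matching the listed contexts and timezones are scheduled.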