Data modeling with YAML, Jinja, and Python
Cube supports authoring dynamic data models using the Jinja templating language and Python. This allows de-duplicating common patterns in your data models as well as dynamically generating data models from a remote data source. Jinja is supported in all YAML data model files.
YAML
It is recommended to default to YAML syntax because of its simplicity and readability.
Folded and literal strings
Sometimes you might want to use multi-line strings in YAML-based data models, e.g., in parameters such as sql or description. It is recommended to use the literal (|) string style in such cases as it preserves line breaks.
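As a brief illustration, the sketch below uses the literal style for sql and, for contrast, the folded style for description. The orders cube and the public.orders table are hypothetical names chosen for this example.

```yaml
cubes:
  - name: orders
    # Literal style (|): line breaks are kept exactly as written.
    sql: |
      SELECT id, status, created_at
      FROM public.orders
      WHERE status != 'cancelled'

    # Folded style (>): these two lines are joined with a space.
    description: >
      Orders placed in the online store,
      excluding cancelled ones.
```

With the literal style, the SQL arrives at the database with the same line structure it has in the file, which tends to make query logs easier to read.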
Jinja
Please check the Jinja documentation for details on Jinja syntax.
Previewing YAML
You can preview the data model code after applying Jinja templates in the Data Model editor by clicking … → Jinja Preview on files that contain Jinja templates in the sidebar.
Currently, there’s no way to preview the data model code in YAML after applying Jinja templates in Cube Core. Please track this issue. In the meantime, the resulting data model can be inspected via the /v1/meta REST API endpoint.
Loops
Jinja supports looping over lists and dictionaries. In the following example, we loop over a list of nested properties and generate a LEFT JOIN UNNEST clause for each one:
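A minimal sketch of such a loop could look as follows. It assumes a hypothetical public.events table with a properties column holding key/value pairs; the cube name and the property names are illustrative, and the exact UNNEST syntax depends on your SQL dialect.

```yaml
{%- set nested_properties = [
  "referrer",
  "href",
  "host"
] -%}

cubes:
  - name: analytics
    sql: |
      SELECT
        {%- for prop in nested_properties %}
        {{ prop }}_prop.value AS {{ prop }}{% if not loop.last %},{% endif %}
        {%- endfor %}
      FROM public.events
      {%- for prop in nested_properties %}
      LEFT JOIN UNNEST(properties) AS {{ prop }}_prop
        ON {{ prop }}_prop.key = '{{ prop }}'
      {%- endfor %}
```

The first loop emits one selected column per property, using loop.last to skip the trailing comma, and the second loop emits one join per property. Adding a property to the list is then enough to extend both parts of the query.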
Macros
Cube data models also support Jinja macros, which allow you to define reusable snippets of code. You can read more about macros in the Jinja documentation. In the following example, we define a macro called dimension() which generates a dimension definition in Cube. This macro is then invoked multiple times to generate multiple dimensions:
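One way to write such a macro is sketched below. The orders cube, its table, and the column names are hypothetical; the macro parameters (column_name, type, primary_key) are choices made for this example rather than anything prescribed by Cube.

```yaml
{# Define the macro before its first use. #}
{%- macro dimension(column_name, type="string", primary_key=false) -%}
      - name: {{ column_name }}
        sql: {{ column_name }}
        type: {{ type }}
        {%- if primary_key %}
        primary_key: true
        {%- endif %}
{%- endmacro -%}

cubes:
  - name: orders
    sql_table: public.orders

    dimensions:
      {{ dimension("id", "number", primary_key=true) }}
      {{ dimension("status") }}
      {{ dimension("created_at", "time") }}
```

The indentation inside the macro body is written to match the place where the macro is called, so that each call expands into a correctly nested list item under dimensions.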
Macros can also generate SQL snippets for use in the sql property.
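A macro does not have to return a whole YAML entry; it can also return a fragment of SQL. As a minimal sketch, the hypothetical cents_to_dollars() macro below wraps a column in a conversion expression. The payments cube, the app_data.payments table, and the PostgreSQL-style cast are all assumptions made for illustration.

```yaml
{%- macro cents_to_dollars(column_name, precision=2) -%}
  ({{ column_name }} / 100)::NUMERIC(16, {{ precision }})
{%- endmacro -%}

cubes:
  - name: payments
    sql: |
      SELECT
        id AS payment_id,
        {{ cents_to_dollars("amount") }} AS amount_usd
      FROM app_data.payments
```

Keeping the conversion in one macro means a change to the rounding rule only has to be made in one place.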
Reusing macros across files
You can define macros in dedicated .jinja files and import them into your data model files using Jinja’s import statement. This is useful for sharing common patterns across multiple cubes and views.
Consider a project where macros are defined in a .jinja file under the macros/ directory and imported into data model files under the model/ directory.
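The data model side of such a setup might look like the sketch below. The file name macros/dimensions.jinja and the macros alias are hypothetical, the file is assumed to contain a dimension() macro like the one described in the Macros section, and the path given to import is an assumption: check how import paths are resolved in your deployment.

```yaml
{# Hypothetical data model file under the model/ directory. #}
{%- import "macros/dimensions.jinja" as macros -%}

cubes:
  - name: orders
    sql_table: public.orders

    dimensions:
      {{ macros.dimension("id", "number") }}
      {{ macros.dimension("status") }}
```

Importing under an alias keeps macro names from different files from colliding when several cubes and views share them.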
Escaping unsafe strings
Auto-escaping of unsafe string values in Jinja templates is enabled by default. This means that any strings coming from Python might get wrapped in quotes, potentially breaking YAML syntax. You can work around that by using the safe Jinja filter with such string values:
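A minimal sketch of the filter in use is shown below. Here get_description() is a hypothetical function assumed to be registered in the Python template context, and the cube and table names are illustrative.

```yaml
cubes:
  - name: my_cube
    sql_table: public.my_table
    # get_description() is a hypothetical Python function. The safe filter
    # marks its return value as safe, so it is emitted without auto-escaping.
    description: {{ get_description() | safe }}
```

Because the value is inserted as-is, the returned string must itself be valid YAML in the position where it is placed.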
cube_dbt package.
Python
Template context
You can use Python to declare functions that can be invoked and variables that can be referenced from within a Jinja template. These functions and variables must be defined in the model/globals.py file and registered in the TemplateContext instance. See the TemplateContext reference for more details.
In the following example, we declare a function called load_data that supposedly loads data from a remote API endpoint. We will then use the function to generate a data model:
Once the function is decorated with the @template.function decorator, we can call it from within a Jinja template. In the following example, we’ll call the load_data() function and use the result to generate a data model.
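A template that consumes such a function could be sketched as follows. It assumes load_data() is registered in model/globals.py and returns a dictionary with a cubes list, where each item has name, sql_table, and an optional dimensions list; this shape is an assumption made for the example, not something the function is required to return.

```yaml
cubes:
  {%- for cube in load_data()["cubes"] %}
  - name: {{ cube.name }}
    sql_table: {{ cube.sql_table }}

    {%- if cube.dimensions %}
    dimensions:
      {%- for dimension in cube.dimensions %}
      - name: {{ dimension.name }}
        sql: {{ dimension.sql }}
        type: {{ dimension.type }}
      {%- endfor %}
    {%- endif %}
  {%- endfor %}
```

Each item returned by the function becomes one cube, so the data model follows the remote source without any cube being written by hand.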
Imports
In the model/globals.py file (or the cube.py configuration file), you can import modules from the current directory. In the following example, we import a function from the utils module and use it to populate a variable in the template context:
Dependencies
If you need to use dependencies in your dynamic data model (or your cube.py configuration file), you can list them in the requirements.txt file in the root directory of your Cube deployment. They will be automatically installed with pip on startup.
The cube package is available out of the box; it doesn’t need to be listed in requirements.txt.
If you work with dbt, you might find the cube_dbt package useful. It provides a set of utilities that simplify defining the data model in YAML based on dbt models.
If you need to use dependencies with native extensions, build a custom Docker
image.