Syntax
Entities within the data model (e.g., cubes, views, etc.) should be placed under themodel folder, follow naming
conventions, and be defined using a supported
syntax.
Folder structure
Data model files should be placed inside themodel folder. You can use the
schema_path configuration option to override the
folder name or the repository_factory configuration
option to dynamically define the folder name and
data model file contents.
It’s recommended to place each cube or view in a separate file, in model/cubes
and model/views folders, respectively. Example:
Model syntax
Cube supports two ways to define data model files: with YAML or JavaScript syntax. YAML data model files should have the.yml extension,
whereas JavaScript data model files should end with .js. You can mix YAML and
JavaScript files within a single data model.
See Cube style guide for more recommendations on syntax and structure.
Naming
Common rules apply to names of entities within the data model. All names must:- Start with a letter.
- Consist of letters, numbers, and underscore (
_) symbols only. - Not be a reserved keyword in Python, e.g.,
from,return, oryield. - When using the DAX API, not clash with the names of columns in date hierarchies.
orders,stripe_invoices, orbase_payments(cubes)opportunities,cloud_accounts, orarr(views)count,avg_price, ortotal_amount_shipped(measures)name,is_shipped, orcreated_at(dimensions)main,orders_by_status, orlambda_invoices(pre-aggregations)
SQL expressions
When defining cubes, you would often provide SQL snippets insql and
sql_table parameters.
Provided SQL expressions should match your database SQL dialect, e.g.,
to aggregate a list of strings, you would probably pick the LISTAGG
function in Snowflake and the STRING_AGG
function in BigQuery.
Currently, Cube does not wrap your SQL snippets in parentheses during SQL
generation. In case of non-trivial snippets, this may lead to unexpected results.
Please track this issue.
User-defined functions
If you have created a user-defined function (UDF) in your data source, you can use it in thesql parameter as well.
Case sensitivity
If your database uses case-sensitive identifiers, make sure to properly quote table and column names. For example, here’s how you can reference a Postgres table that contains uppercase letters in its name:References
To write versatile data models, it is important to be able to reference members of cubes and views, such as measures or dimensions, as well as table columns. Cube supports the following syntax for references.column
Most commonly, you would use bare column names in the sql parameter of
measures or dimensions. In the following example, name references
the respective column of the users table.
{member}
When defining measures and dimensions, you can also reference other members
of the same cube by wrapping their names in curly braces. In the following
example, the full_name dimension references name and surname dimensions
of the same cube.
{time_dimension.granularity}
When referencing a time dimension, you can specificy a
granularity to refer to a time value with that specific granularity. It can be
one of the default granularities (e.g., year or
week) or a custom granularity:
{cube}.column, {cube.member}
You can qualify column and member names with the name of a cube to remove
the ambiguity when cubes are joined and reference members of other cubes.
{cube1.cube2.member}
You can also qualify member names with more than one cube name, separated by a dot, to
provide a join path and remove the ambiguity of join resolution. Join
paths can be used in calculated members, views,
and pre-aggregation definitions.
In the following example, four cubes are joined together, forming a diamond
subgraph in the join tree: a joins to b and c, and both
b and c join to d. As you can see, join paths in the d_via_b and d_via_c
dimensions of a are then used to specify which intermediate cube to use when resolving
the dimension from d.
{CUBE} variable
You can use a handy {CUBE} context variable
(mind the uppercase) to reference the current cube so you don’t have to
repeat the its name over and over. It works both for column and member
references.
{users.name} dimension. Referencing another cube in the
dimension definition instructs Cube to make an implicit join to that cube.
For example, using the data model above, we can make the following query:
{cube.sql()} function
When defining a cube, you can reference the sql parameter of another cube,
effectively reusing the SQL query it’s defined on. This is particularly useful
when defining polymorphic cubes or using data blending.
Consider the following data model:
dogs.count, Cube will generate the following SQL:
Curly braces and escaping
As you can see in the examples above, within SQL expressions, curly braces are used to reference cubes and members. In YAML data models, use{reference}:
${reference} in JavaScript template
literals (mind the dollar sign):
Non-SQL references
Outside SQL expressions,column is not recognized
as a column name; it is rather recognized as a member name. It means that,
outside sql and sql_table parameters, you can skip the curly braces and
reference members by their names directly: member, cube_name.member, or
CUBE.member.
Context variables
In addition to theCUBE variable, you can also use a few more context
variables within your data model. They are
generally useful for two purposes: optimizing generated SQL queries and
defining dynamic data models.
Troubleshooting
Can't parse timestamp
Sometimes, you might come across the following error message: Can't parse timestamp: 2023-11-07T14:33:23.16.000.
It indicates that the data source was unable to recognize the value of a time
dimension as a timestamp. Please check that the SQL
expression of this time dimension evaluates to a TIMESTAMP type.
Check this recipe to see how you can work around
string values in time dimensions.