Deployment types
Cube Cloud provides you with three deployment types:
Development instance
Available for free, no credit card required. Your free trial is limited to 2
development instances and only 1,000 queries per day. Upgrade to
any paid plan to unlock all features.
Development instances are designed for development use cases only. They make it
easy to get started with Cube Cloud quickly and allow you to build and
query pre-aggregations on demand.
Development instances don’t have dedicated refresh workers
and, consequently, they do not refresh pre-aggregations on schedule.
Development instances do not provide high availability, nor do they guarantee
fast response times. Development instances also auto-suspend
after 30 minutes of inactivity, which can cause the first request after the instance
wakes up to take additional time to process. They also have limits
on the maximum number of queries per day and the maximum number of Cube Store
Workers. We strongly advise against using a development instance in a production
environment. It is intended only for testing and learning about Cube, and it will not deliver
a production-level experience for your users.
To try a Cube Cloud development instance,
sign up for Cube Cloud. It is free
(no credit card required).
Production cluster
Production clusters are designed to support high-availability production
workloads. A production cluster consists of several key components, starting with 2 Cube
API instances, 1 Cube Refresh Worker, and 2 Cube Store Routers, all of which run
on dedicated infrastructure. The cluster can automatically scale to meet the
needs of your workload by adding more components as necessary; see the page on
scalability to learn more.
Production multi-cluster
Production multi-cluster deployments are designed for demanding production
workloads, high scalability, high availability, and large multi-tenancy
configurations, e.g., with more than 100 tenants.
It provides you with two options:
- Scale the number of production cluster deployments
serving your workload, allowing you to route requests across up to 10 production
clusters and up to 100 API instances.
- Optionally, scale the number of Cube Store routers, allowing for increased
Cube Store querying performance.
Each production cluster is billed separately, and all production clusters can
use auto-scaling to match demand.
Configuring production multi-cluster
To switch your Cube Cloud deployment to production multi-cluster, navigate to
Settings → General, select it under Type, and confirm
with ✓.
To set the number of production clusters within your production multi-cluster
deployment, navigate to Settings → Configuration and edit
Number of clusters.
Routing traffic between production clusters
Cube Cloud routes requests between multiple production clusters within a
production multi-cluster deployment based on context_to_app_id.
In most cases, it should return an identifier that does not change over time
for each tenant.
The following implementation makes sure that all requests from a
particular tenant are always routed to the same production cluster. This
approach ensures that only one production cluster keeps a compiled data model
cache for each tenant and serves its requests, which reduces the
footprint of the compiled data model cache on individual production clusters.
from cube import config


@config('context_to_app_id')
def context_to_app_id(ctx: dict) -> str:
    # Derive the identifier from the tenant only, so it stays stable over time.
    return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"
If your implementation of context_to_app_id returns identifiers that change
over time for each tenant, requests from one tenant will likely hit multiple
production clusters, and you will lose the benefit of a reduced memory footprint.
You might also see 502 or timeout errors if different cluster nodes return different context_to_app_id results for the same request.
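The difference between a stable and a changing identifier can be illustrated with a small standalone sketch. It does not use the cube package, and it is not how Cube Cloud routes requests internally: the pick_cluster function (hashing the identifier modulo the cluster count), the NUM_CLUSTERS value, the requestId key, and the helper names are all assumptions made for illustration only.

```python
import hashlib

NUM_CLUSTERS = 3  # hypothetical number of production clusters


def stable_app_id(ctx: dict) -> str:
    # Depends only on the tenant, so it is the same for every request.
    return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"


def unstable_app_id(ctx: dict) -> str:
    # Anti-pattern: mixes in a per-request value, so the identifier keeps changing.
    return f"CUBE_APP_{ctx['securityContext']['tenant_id']}_{ctx['requestId']}"


def pick_cluster(app_id: str, num_clusters: int = NUM_CLUSTERS) -> int:
    # Hypothetical router (an assumption, not Cube Cloud's real logic):
    # hash the identifier and take the result modulo the cluster count.
    digest = hashlib.sha256(app_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_clusters


def clusters_hit(app_id_fn, tenant_id: str, requests: int = 50) -> set:
    # Simulate several requests from one tenant and collect the clusters they reach.
    hit = set()
    for i in range(requests):
        ctx = {
            "securityContext": {"tenant_id": tenant_id},
            "requestId": f"req-{i}",  # hypothetical per-request value
        }
        hit.add(pick_cluster(app_id_fn(ctx)))
    return hit


if __name__ == "__main__":
    print("stable identifier reaches clusters:", clusters_hit(stable_app_id, "acme"))
    print("unstable identifier reaches clusters:", clusters_hit(unstable_app_id, "acme"))
```

Under these assumptions, the stable identifier always lands on a single cluster, while the per-request identifier spreads one tenant's traffic over several clusters, each of which would then need its own compiled data model cache for that tenant.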
Switching between deployment types
To switch a deployment’s type, go to the deployment’s Settings screen
and select from the available options: