Bring Your Own Model (BYOM) lets you connect your own LLM provider to power AI agents in Cube, instead of using the built-in models. This gives you full control over which models your agents use, where your data is processed, and how you manage AI costs.

Supported providers

Provider          Chat models  Embedding models
Anthropic         Yes          No
OpenAI            Yes          Yes
AWS Bedrock       Yes          Yes
GCP Vertex AI     Yes          No
Databricks        Yes          No
Snowflake Cortex  Yes          No

Configuration

Step 1: Add a model

Before assigning a BYOM model to an agent, you need to register it in the admin panel:
  1. Navigate to Admin > Models
  2. Click Add Model
  3. Provide a name for the model
  4. Select the model type (LLM or Embedding)
  5. Choose a provider and model
  6. Enter the required credentials for the provider

Step 2: Assign the model to an agent

Once a model is registered, reference it in the agent's YAML configuration by name or ID:
agents:
  - name: sales-analyst
    llm:
      byom:
        name: "my-anthropic-model"
    embedding_llm:
      byom:
        name: "my-bedrock-embeddings"
Each agent can use a different model. If no BYOM model is specified, the agent uses the built-in default.
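To reference a model by its ID instead of its name, the configuration might look like the following (the `id` key and its value are illustrative assumptions; the actual model ID is shown in the admin panel):
agents:
  - name: sales-analyst
    llm:
      byom:
        id: "mdl-01234"  # hypothetical ID copied from Admin > Models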
Switching an agent's embedding model invalidates its existing memories: memories are tied to the embedding model that created them and are not compatible across models.

Network configuration

When using BYOM, Cube connects to your model provider from its control plane. If your provider requires IP allowlisting, ensure the Cube outbound IP addresses are added to your allowlist. For agents running in dedicated regions, additional per-region IP addresses may also need to be allowlisted.

Billing

When using a BYOM model, Cube AI tokens are not consumed. You are billed directly by your model provider based on their pricing. This means:
  • No Cube token quota is deducted for BYOM chat requests
  • No token usage is tracked in the AI Tokens Usage dashboard for BYOM requests
  • Per-seat token grants and token packages do not apply
See AI Tokens for details on how token billing works with built-in models.

Provider-specific notes

Anthropic

Supports extended thinking mode for compatible models. Configure this in the model settings when creating the model.

AWS Bedrock

  • Credentials are optional — if left empty, the default AWS credential chain is used (e.g., workload identity)
  • Supports assume-role configuration for cross-account access
  • Supports inference profiles
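For cross-account access, the assume-role settings might be sketched as follows. All field names below are illustrative assumptions; the actual fields appear in the admin panel when registering a Bedrock model:
# Hypothetical sketch of a Bedrock model registration with assume-role
provider: aws-bedrock
region: us-east-1
credentials: {}  # empty: fall back to the default AWS credential chain
assume_role:
  role_arn: "arn:aws:iam::123456789012:role/CubeByomRole"  # placeholder role ARN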

GCP Vertex AI

Requires a service account JSON key for authentication.
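The key is the standard JSON file downloaded when creating a service account key in the GCP console; it has roughly this shape (project and account names below are placeholders, and values are truncated):
{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "0123abcd",
  "private_key": "-----BEGIN PRIVATE KEY-----\n(truncated)\n-----END PRIVATE KEY-----\n",
  "client_email": "cube-byom@my-project.iam.gserviceaccount.com"
}
Paste the entire file contents into the credentials field when registering the model.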

Databricks

Requires a workspace URL and access token.

Snowflake Cortex

Supports two authentication methods:
  • JWT authentication
  • Key-pair authentication (requires an encrypted PKCS#8 PEM private key)
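A suitable key pair can be generated with OpenSSL along these lines; this is a sketch, and the passphrase shown is a placeholder you should replace:

```shell
# Generate a 2048-bit RSA key and wrap it as an encrypted PKCS#8 PEM,
# which is the format key-pair authentication expects
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -v2 aes-256-cbc \
  -out rsa_key.p8 -passout pass:MyPassphrase

# Derive the matching public key to register with your Snowflake user
openssl pkey -in rsa_key.p8 -passin pass:MyPassphrase -pubout -out rsa_key.pub
```

Register the public key on the Snowflake user, then provide the encrypted private key and its passphrase when creating the model in Cube.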

Troubleshooting

Rate limit errors

If you see rate limit errors, the limits are enforced by your model provider, not by Cube. Check your provider’s rate limits and usage quotas.

Authentication errors

Verify that the API key or credentials configured for the model are valid and have the necessary permissions.

Model not found

Ensure the model ID configured in Cube matches a valid model offered by your provider. Model availability may vary by region.