> ## Documentation Index
> Fetch the complete documentation index at: https://docs.cube.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Google BigQuery

> Grant a Google Cloud service account the right BigQuery IAM roles, then point Cube at your project, region, and key material.

## Prerequisites

<Info>
  In order to connect Google BigQuery to Cube, you need to provide service account
  credentials. Cube requires the service account to have **BigQuery Data Viewer**
  and **BigQuery Job User** roles enabled. If you plan to use pre-aggregations,
  the account will need the **BigQuery Data Editor** role instead of **BigQuery Data Viewer**.
  You can learn more about acquiring
  Google BigQuery credentials [here][bq-docs-getting-started].
  In Cube Cloud, you can authenticate with [OIDC workload identity
  federation][ref-oidc-gcp-bigquery] instead of a key file — the same roles
  apply to the impersonated service account.
</Info>

* The [Google Cloud Project ID][google-cloud-docs-projects] for the
  [BigQuery][bq] project
* A set of [Google Cloud service credentials][google-support-create-svc-account]
  which [allow access][bq-docs-getting-started] to the [BigQuery][bq] project
* The [Google Cloud region][bq-docs-regional-locations] for the [BigQuery][bq]
  project

## Setup

### Manual

Add the following to a `.env` file in your Cube project:

```dotenv theme={"dark"}
CUBEJS_DB_TYPE=bigquery
CUBEJS_DB_BQ_PROJECT_ID=my-bigquery-project-12345
CUBEJS_DB_BQ_KEY_FILE=/path/to/my/keyfile.json
```

You could also encode the key file using Base64 and set the result to
[`CUBEJS_DB_BQ_CREDENTIALS`](/reference/configuration/environment-variables#cubejs_db_bq_credentials):

```dotenv theme={"dark"}
CUBEJS_DB_BQ_CREDENTIALS=$(cat /path/to/my/keyfile.json | base64)
```

### Cube Cloud

<Info>
  In some cases you'll need to allow connections from your Cube Cloud deployment
  IP address to your database. You can copy the IP address from either the
  Database Setup step in deployment creation, or from **Settings →
  Configuration** in your deployment.
</Info>

The following fields are required when creating a BigQuery connection:

<Frame>
  <img src="https://ucarecdn.com/9fdfbd5d-5556-48b3-aa6e-fff0f24b30e9/" alt="Cube Cloud BigQuery Configuration Screen" />
</Frame>

#### OIDC workload identity federation

Instead of a service account key file, Cube Cloud deployments can
authenticate to BigQuery with [OIDC workload identity
federation][ref-oidc-gcp-bigquery]: a Workload Identity Federation provider
in your GCP project trusts Cube's OIDC issuer, and the driver authenticates
through the GCP default credential chain — no JSON key to provision or
rotate. Select **OIDC workload identity federation** in the connection
wizard, or set the equivalent environment variables:

```dotenv theme={"dark"}
CUBEJS_DB_TYPE=bigquery
CUBEJS_DB_BQ_PROJECT_ID=my-bigquery-project-12345
GCP_POOL_AUDIENCE=//iam.googleapis.com/projects/123456789/locations/global/workloadIdentityPools/cube-pool/providers/cube
GCP_SERVICE_ACCOUNT_EMAIL=cube-deployment@my-bigquery-project-12345.iam.gserviceaccount.com
```

`GCP_SERVICE_ACCOUNT_EMAIL` selects the service account Cube impersonates;
leave it unset to authenticate as the federated principal directly. See the
[GCP OIDC guide][ref-oidc-gcp-bigquery] for the Workload Identity Pool,
provider, and IAM setup.

Cube Cloud also supports connecting to data sources within private VPCs
if [single-tenant infrastructure][ref-dedicated-infra] is used. Check out the
[VPC connectivity guide][ref-cloud-conf-vpc] for details.

[ref-dedicated-infra]: /docs/deployment/cloud/infrastructure#dedicated-infrastructure

[ref-cloud-conf-vpc]: /docs/deployment/cloud/vpc

## Environment Variables

| Environment Variable                                                                                          | Description                                                                                                                                                                                     | Possible Values                                                         |    Required   |
| ------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------- | :-----------: |
| [`CUBEJS_DB_BQ_PROJECT_ID`](/reference/configuration/environment-variables#cubejs_db_bq_project_id)           | The Google BigQuery project ID to connect to                                                                                                                                                    | A valid Google BigQuery Project ID                                      |       ✅       |
| [`CUBEJS_DB_BQ_KEY_FILE`](/reference/configuration/environment-variables#cubejs_db_bq_key_file)               | The path to a JSON key file for connecting to Google BigQuery                                                                                                                                   | A valid Google BigQuery JSON key file                                   | ✅<sup>1</sup> |
| [`CUBEJS_DB_BQ_CREDENTIALS`](/reference/configuration/environment-variables#cubejs_db_bq_credentials)         | A Base64 encoded JSON key file for connecting to Google BigQuery                                                                                                                                | A valid Google BigQuery JSON key file encoded as a Base64 string        |       ❌       |
| [`CUBEJS_DB_BQ_LOCATION`](/reference/configuration/environment-variables#cubejs_db_bq_location)               | The Google BigQuery dataset location to connect to. Required if used with pre-aggregations outside of US. If not set then BQ driver will fail with `Dataset was not found in location US` error | [A valid Google BigQuery regional location][bq-docs-regional-locations] |       ⚠️      |
| [`CUBEJS_DB_EXPORT_BUCKET`](/reference/configuration/environment-variables#cubejs_db_export_bucket)           | The name of a bucket in cloud storage                                                                                                                                                           | A valid bucket name from cloud storage                                  |       ❌       |
| [`CUBEJS_DB_EXPORT_BUCKET_TYPE`](/reference/configuration/environment-variables#cubejs_db_export_bucket_type) | The cloud provider where the bucket is hosted                                                                                                                                                   | `gcp`                                                                   |       ❌       |
| [`CUBEJS_DB_MAX_POOL`](/reference/configuration/environment-variables#cubejs_db_max_pool)                     | The maximum number of concurrent database connections to pool. Default is `40`                                                                                                                  | A valid number                                                          |       ❌       |
| [`CUBEJS_CONCURRENCY`](/reference/configuration/environment-variables#cubejs_concurrency)                     | The number of [concurrent queries][ref-data-source-concurrency] to the data source                                                                                                              | A valid number                                                          |       ❌       |

<sup>1</sup> Not required when authenticating with [OIDC workload identity
federation][ref-oidc-gcp-bigquery] in Cube Cloud, or when providing the key
material via [`CUBEJS_DB_BQ_CREDENTIALS`](/reference/configuration/environment-variables#cubejs_db_bq_credentials).

[ref-data-source-concurrency]: /admin/connect-to-data/concurrency#data-source-concurrency

## Pre-Aggregation Feature Support

### count\_distinct\_approx

Measures of type
[`count_distinct_approx`][ref-schema-ref-types-formats-countdistinctapprox] can
be used in pre-aggregations when using Google BigQuery as a source database. To
learn more about Google BigQuery's support for approximate aggregate functions,
[click here][bq-docs-approx-agg-fns].

## Pre-Aggregation Build Strategies

<Info>
  To learn more about pre-aggregation build strategies, [head
  here][ref-caching-using-preaggs-build-strats].
</Info>

| Feature       | Works with read-only mode? | Is default? |
| ------------- | :------------------------: | :---------: |
| Batching      |              ❌             |      ✅      |
| Export Bucket |              ❌             |      ❌      |

By default, Google BigQuery uses [batching][self-preaggs-batching] to build
pre-aggregations.

### Batching

No extra configuration is required to configure batching for Google BigQuery.

### Export bucket

<Warning>
  BigQuery only supports using Google Cloud Storage for export buckets.
</Warning>

#### Google Cloud Storage

For [improved pre-aggregation performance with large
datasets][ref-caching-large-preaggs], enable export bucket functionality by
configuring Cube with the following environment variables:

<Info>
  When using an export bucket, remember to assign the **BigQuery Data Editor** and
  **Storage Object Admin** role to your BigQuery service account.
</Info>

```dotenv theme={"dark"}
CUBEJS_DB_EXPORT_BUCKET=export_data_58148478376
CUBEJS_DB_EXPORT_BUCKET_TYPE=gcp
```

## SSL

Cube does not require any additional configuration to enable SSL as Google
BigQuery connections are made over HTTPS.

[bq]: https://cloud.google.com/bigquery

[bq-docs-getting-started]: https://cloud.google.com/docs/authentication/getting-started

[bq-docs-regional-locations]: https://cloud.google.com/bigquery/docs/locations#regional-locations

[bq-docs-approx-agg-fns]: https://cloud.google.com/bigquery/docs/reference/standard-sql/approximate_aggregate_functions

[google-cloud-docs-projects]: https://cloud.google.com/resource-manager/docs/creating-managing-projects#before_you_begin

[google-support-create-svc-account]: https://support.google.com/a/answer/7378726?hl=en

[ref-caching-large-preaggs]: /docs/pre-aggregations/using-pre-aggregations#export-bucket

[ref-caching-using-preaggs-build-strats]: /docs/pre-aggregations/using-pre-aggregations#pre-aggregation-build-strategies

[ref-oidc-gcp-bigquery]: /admin/deployment/oidc/gcp#bigquery

[ref-schema-ref-types-formats-countdistinctapprox]: /reference/data-modeling/measures#type

[self-preaggs-batching]: #batching
