Send data to Google BigQuery

Most recent version: v0.0.1

See the changelog of the Google BigQuery Data sink type here.

Overview

Onum supports integration with Google BigQuery.

Google BigQuery is an autonomous data-to-AI platform, automating the entire data life cycle, from ingestion to AI-driven insights, so you can go from data to AI to action faster.

Prerequisites

You will need:

  • A Google Cloud project to connect to.

  • A user or service account with the required BigQuery roles (see below).

  • A service account key file to authenticate with (see the setup steps below).

Google BigQuery Setup

Once you have your project to connect to, make sure your user or service account has the proper roles:

  • BigQuery Admin (roles/bigquery.admin) – full access.

  • BigQuery Data Editor (roles/bigquery.dataEditor) / BigQuery Data Viewer (roles/bigquery.dataViewer) – if you only need to write or read data.

  • BigQuery Job User (roles/bigquery.jobUser) – required to run queries.

Go to IAM & Admin > IAM > Add Principal and choose a role like BigQuery Admin.

Enable these in your project:

  • BigQuery API

  • Cloud Storage API

Make a note of the Project ID for later use.
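
If you want to confirm that the roles and APIs are active before configuring Onum, a quick check with the BigQuery client library can help. This is a minimal sketch, assuming the google-cloud-bigquery Python package is installed; the Project ID and key.json path are placeholders for your own values:

```python
from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder values – replace with your own Project ID and key file path.
PROJECT_ID = "my-project-123456"
credentials = service_account.Credentials.from_service_account_file("key.json")
client = bigquery.Client(project=PROJECT_ID, credentials=credentials)

# Listing datasets exercises the BigQuery API and the account's roles.
for dataset in client.list_datasets():
    print(dataset.dataset_id)
```

If this lists your datasets without errors, the project, roles, and API are ready for the Onum setup below.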

Onum setup

1

Log in to your Onum tenant and click Data sinks > New Data sink.

2

Double-click the Google BigQuery Sink.

3

Enter a Name for the new Data Sink. Optionally, add a Description and some Tags to identify the Sink.

4

Decide whether or not to include this Data sink's info in the metrics and graphs of the Home area.

5

Enter the Project ID*, a unique string with the format my-project-123456.

To get it:

  1. Go to the Google Cloud Console.

  2. In the top left corner, click on the project drop-down next to the Google Cloud logo (where your current project name is shown).

  3. Each project will have a Project Name and a Project ID.

  4. You can also find it in the Settings tab on the left-hand side.

6

The Google BigQuery connector uses OAuth 2.0 credentials for authentication and authorization.

Enter the Credentials File* by creating a secret containing these credentials or select one already created. To find the credentials file:

  1. Go to Google Cloud Settings > Interoperability.

  2. Scroll down to the Service Account area.

  3. You need to generate and download a service account key from the Google Cloud Console. The key is only shown once, when it is created, so if you did not save it at the time, create a new one here and keep it to paste in the next step.

  4. To see existing Service Accounts, go to the menu in the top left and select APIs & Services > Credentials.
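
Before pasting the key into a secret, you can sanity-check that the file is a valid service account key. A minimal sketch; the field names are standard for Google service account keys, and the file path is a placeholder:

```python
import json

# Placeholder path – the key file you downloaded from the console.
with open("key.json") as f:
    key = json.load(f)

# These fields are standard in Google service account key files.
assert key["type"] == "service_account"
print("Service account:", key["client_email"])
print("Project ID:", key["project_id"])
```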

7

Click New secret to create a new one:

  • Give the secret a Name.

  • Turn off the Expiration date option.

  • Click Add new value and paste the contents of the service account key file you downloaded before.

  • Click Save.

Click Create data sink when complete.

Your new Data sink will appear in the Data sinks area list.

Pipeline configuration

When it comes to using this Data sink in a Pipeline, you must configure the following output parameters. To do so, click the Data sink on the canvas and select Configuration.

Output configuration

1

Data to insert

Here is where you decide how your data will appear in your BigQuery project.

  • Dataset* – Give a name to the dataset that will appear in your BigQuery storage.

  • Table* – Enter a name for the table to insert the values into.

2

Column / Value pairs

Click Add Element to add as many pairs as needed.

  • Column* – Enter a name for the column that will appear in BigQuery.

  • Value* – Select the incoming event field that contains the value to send.
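
For reference, each Column/Value pair behaves like a key in the row that ends up in BigQuery. The sketch below illustrates the equivalent insert with the client library; the dataset, table, column, and event field names are hypothetical, and this is not Onum's internal implementation:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project-123456")

# Dataset* and Table* from the output configuration (hypothetical names).
table_id = "my-project-123456.onum_dataset.onum_table"

# Each Column/Value pair maps an incoming event field to a column.
event = {"src_ip": "10.0.0.1", "bytes": 512}                 # hypothetical event
row = {"source_ip": event["src_ip"], "byte_count": event["bytes"]}

errors = client.insert_rows_json(table_id, [row])            # streaming insert
if errors:
    print("Insert failed:", errors)
```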

3

Bulk options

If you wish to enable bulk sending, set Bulk allow* to True. Otherwise, set it to False.

Enter the Event max amount*, the number of events to accumulate into a bulk. The minimum value is 0 and the maximum is 5000.

The Event max time in seconds* is the number of seconds to wait before considering the bulk as full and sending it on. The minimum value is 0.
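
The two settings work together: a bulk is sent as soon as either threshold is reached. A rough sketch of that logic (illustrative only, not Onum's implementation):

```python
import time

# Bulk thresholds, mirroring the parameters above.
BULK_MAX_EVENTS = 5000    # Event max amount* (0–5000)
BULK_MAX_SECONDS = 10     # Event max time in seconds* (minimum 0)

buffer = []
last_flush = time.monotonic()

def on_event(event, send):
    """Buffer an event and flush when either bulk threshold is hit."""
    global last_flush
    buffer.append(event)
    full = len(buffer) >= BULK_MAX_EVENTS
    timed_out = time.monotonic() - last_flush >= BULK_MAX_SECONDS
    if full or timed_out:
        send(list(buffer))   # e.g. client.insert_rows_json(table_id, buffer)
        buffer.clear()
        last_flush = time.monotonic()
```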

Click Save to complete the process.
