Collect data from Google Cloud Storage
See the changelog of the Google Cloud Storage Listener here.
The Google Cloud Storage Listener is only available in certain Tenants. Get in touch with us if you don't see it and want to access it.
The Google Cloud Storage Listener is a Pull Listener and therefore should not be used in environments with more than one cluster.
Overview
Onum supports integration with Google Cloud Storage.
Google Cloud Storage is an online object storage service that allows users to store and retrieve data. It is a managed service, meaning Google handles the underlying infrastructure, making it scalable and reliable. GCS is designed for a variety of use cases, including storing data for web applications, big data analytics, and backups.
Select Google Cloud Storage from the list of Listener types and click Configuration to start.
Prerequisites
To use this Listener, you must first activate the GOOGLE_CLOUD_STORAGE_LISTENER_EXECUTION_ENABLED environment variable in your distributor using Docker Compose.
Google Cloud Storage Setup
To source data from Google Cloud Storage you need to have a GCS bucket with data, appropriate permissions (like Storage Admin) to access the bucket and its objects, and the correct resource path (e.g., gs://bucket-name/object-name).
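As an illustration of the resource path format mentioned above, here is a minimal Python sketch that splits a gs:// path into its bucket and object parts. The function name split_gcs_path is hypothetical and is not part of Onum or of any Google library; it only shows how the two parts of the path relate.

```python
def split_gcs_path(path: str) -> tuple:
    """Split 'gs://bucket-name/object-name' into (bucket, object_name).

    Hypothetical helper for illustration only.
    """
    prefix = "gs://"
    if not path.startswith(prefix):
        raise ValueError(f"not a gs:// resource path: {path!r}")
    # Everything up to the first '/' is the bucket; the rest is the object name.
    bucket, _, object_name = path[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in: {path!r}")
    return bucket, object_name


if __name__ == "__main__":
    print(split_gcs_path("gs://bucket-name/object-name"))
```

The bucket part is what goes in the Bucket field later in this guide. An object name may itself contain slashes, which is why only the first slash is used as the separator.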
See the Google Cloud Storage manual for help.
Onum Setup
Log in to your Onum tenant and click Listeners > New listener.
Double-click the Google Cloud Storage Listener.
Enter a Name* for the new Listener. Optionally, add a Description and some Tags to identify the Listener.
The Google Cloud connector uses OAuth 2.0 credentials for authentication and authorization. In the Credentials file* field, create a new Secret containing these credentials or select one already created. To get the credentials file:
In the Google Cloud Console, go to Settings > Interoperability.
Scroll down to the Service Account area.
Generate and download a service account key. An existing key cannot be viewed again, so you must already have a copy saved somewhere. If you don't, create a new key, save it, and paste its contents into the Secret.
To see existing Service Accounts, go to the menu in the top left and select APIs & Services > Credentials.
Learn more about secrets in Onum in this article.
Assign an optional Event delimiter to split file content into different events using a delimiter (Examples: -, \n, \r\n, 0x0A...).
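To make the effect of the Event delimiter concrete, the following Python sketch splits file content into events. It is an assumption-laden illustration, not Onum's implementation: the function name split_events is hypothetical, and whether the Listener discards empty events is not stated in this guide (this sketch drops them).

```python
def split_events(content: bytes, delimiter: bytes = b"\n") -> list:
    """Split raw file content into events on the given delimiter.

    Hypothetical helper. Empty pieces (for example, after a trailing
    delimiter) are dropped here, which is an assumption.
    """
    if not delimiter:
        # No delimiter configured: treat the whole file as a single event.
        return [content] if content else []
    return [piece for piece in content.split(delimiter) if piece]


if __name__ == "__main__":
    # b"\n" is the same byte as 0x0A.
    print(split_events(b"first\nsecond\nthird\n"))
    print(split_events(b"first\r\nsecond", b"\r\n"))
```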
Choose the Compression type* for your files (None, Gzip, Bzip2 or Auto).
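The sketch below shows one way the four compression options could be handled, using only the Python standard library. How Onum implements Auto is not documented here; detecting the format from the leading magic bytes of the file is an assumption made for this example, and the function name decompress is hypothetical.

```python
import bz2
import gzip

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip stream
BZIP2_MAGIC = b"BZh"       # first three bytes of every bzip2 stream


def decompress(data: bytes, compression: str = "auto") -> bytes:
    """Return the decompressed content for None, Gzip, Bzip2 or Auto."""
    kind = compression.lower()
    if kind == "auto":
        # Assumption: guess the format from the file's magic bytes.
        if data.startswith(GZIP_MAGIC):
            kind = "gzip"
        elif data.startswith(BZIP2_MAGIC):
            kind = "bzip2"
        else:
            kind = "none"
    if kind == "gzip":
        return gzip.decompress(data)
    if kind == "bzip2":
        return bz2.decompress(data)
    if kind == "none":
        return data
    raise ValueError(f"unsupported compression type: {compression!r}")


if __name__ == "__main__":
    print(decompress(gzip.compress(b"event data"), "auto"))
```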
If you set the Read Bucket Once parameter to true, the Listener will read the entire bucket once and stop the execution. You'll be prompted to enter the following:
Prefix - The optional string that acts like a folder path or directory structure when organizing objects within a bucket.
Bucket* - Enter the GCP bucket name.
Start at* - This will block the Listener from starting until this timestamp. The required date format is DD/MM/YYYY HH:mm. The specified time must be in the future and conform to the timezone where the operation is being executed.
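As a sketch of the Start at rules, the following Python function parses a DD/MM/YYYY HH:mm value and rejects timestamps that are not in the future. The function name parse_start_at and the now parameter are hypothetical, and the comparison uses the local clock of the machine running the code, which is an assumption about what "the timezone where the operation is being executed" means in practice.

```python
from datetime import datetime
from typing import Optional

START_AT_FORMAT = "%d/%m/%Y %H:%M"  # DD/MM/YYYY HH:mm


def parse_start_at(value: str, now: Optional[datetime] = None) -> datetime:
    """Parse a Start at value and check that it lies in the future.

    `now` can be injected for testing; by default the local time is used.
    """
    start = datetime.strptime(value, START_AT_FORMAT)  # raises ValueError on bad format
    current = now if now is not None else datetime.now()
    if start <= current:
        raise ValueError(f"Start at must be in the future, got {value!r}")
    return start


if __name__ == "__main__":
    reference = datetime(2030, 1, 1, 0, 0)
    print(parse_start_at("31/12/2030 23:59", now=reference))
```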
The Project ID* is a unique string with the following format: my-project-123456. To get it:
Go to the Google Cloud Console.
In the top left corner, click on the project drop-down next to the Google Cloud logo (where your current project name is shown).
Each project will have a Project Name and a Project ID.
You can also find it in the Settings tab on the left-hand side.
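Before pasting a value into the Project ID field, a quick format check can catch a Project Name entered by mistake. The sketch below assumes Google Cloud's general project ID rules (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen); the function name looks_like_project_id is hypothetical, and passing this check does not prove that the project exists.

```python
import re

# Assumed Google Cloud project ID rules: 6-30 chars, lowercase letters,
# digits and hyphens, starts with a letter, does not end with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_project_id(value: str) -> bool:
    """Return True if the value has the shape of a project ID."""
    return PROJECT_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    print(looks_like_project_id("my-project-123456"))
    print(looks_like_project_id("My Project"))
```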
Enter your Subscription (called Subscription ID in the Cloud Console). Follow these steps to get it:
Go to Pub/Sub in the Google Cloud Console.
In the top left corner, click on the menu and select View all Products.
Then go to Analytics and find Pub/Sub. Click it to go to Pub/Sub (you can also use the search bar and type Pub/Sub).
In the Pub/Sub dashboard, select the Subscriptions tab on the left.
The Subscription ID will be displayed in this list.
In case of a failure to connect, enter the following parameters:
Number of retries* - Enter the maximum number of retries to perform in case of a failure. The minimum value is 1, and the maximum value is 5. The default value is 3.
Retry delay* - Enter the number of milliseconds to wait between retries. The minimum and default value is 100, and the maximum value is 1000.
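The interaction between the two retry parameters can be sketched as follows. This is an illustration under stated assumptions, not Onum's actual connection logic: the names connect_with_retries and clamp are hypothetical, out-of-range values are clamped to the documented limits rather than rejected, and the first connection attempt is assumed not to count as a retry.

```python
import time


def clamp(value: int, low: int, high: int) -> int:
    """Keep value within the inclusive range [low, high]."""
    return max(low, min(high, value))


def connect_with_retries(connect, retries: int = 3, delay_ms: int = 100,
                         sleep=time.sleep):
    """Call connect(), retrying on ConnectionError.

    retries:  1 to 5, default 3 (documented limits).
    delay_ms: 100 to 1000 milliseconds, default 100 (documented limits).
    """
    retries = clamp(retries, 1, 5)
    delay_ms = clamp(delay_ms, 100, 1000)
    last_error = None
    # One initial attempt plus up to `retries` further attempts (assumption).
    for attempt in range(1 + retries):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < retries:
                sleep(delay_ms / 1000)  # wait before the next retry
    raise last_error


if __name__ == "__main__":
    attempts = []

    def flaky_connect():
        # Fails twice, then succeeds.
        attempts.append(1)
        if len(attempts) < 3:
            raise ConnectionError("temporary failure")
        return "connected"

    print(connect_with_retries(flaky_connect, retries=3, delay_ms=100))
```

The sleep function is passed in as a parameter so that the waiting behavior can be replaced in tests without slowing them down.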
Finally, click Create labels. Optionally, you can set labels to be used for internal Onum routing of data. By default, data will be set as Unlabeled.
Learn more about labels in this article.
Click Create listener when you're done.
Output Ports
The Google Cloud Storage Listener has two output ports:
Default port - Events are sent through this port if no error occurs while processing them.
Error port - Events are sent through this port if an error occurs while processing them.

