Collect data from Azure Event Hubs

Most recent version: v1.0.1

See the changelog of the Azure Event Hubs Listener here.

Overview

Onum supports integration with Azure Event Hubs.

The Azure Event Hubs Listener receives messages from an Azure Event Hub for real-time data streaming, providing support for message batching, retries, and secure connection options.

Prerequisites

To use this Listener, you must enable the AZURE_EVENTHUB_LISTENER_EXECUTION_ENABLED environment variable in your distributor using Docker Compose.

Azure Event Hubs Setup

Onum needs the following Azure resources to communicate with the event hub:

  • Event Hubs namespace

  • Event hub

See the Azure Event Hubs documentation for how to create these.

Onum Setup

1

Log in to your Onum tenant and click Listeners > New listener.

2

Double-click the Azure Event Hubs Listener.

3

Enter a Name for the new Listener. Optionally, add a Description and some Tags to identify the Listener.

4

Establish the Event Hub Connection

  • Enter the Event Hub Namespace* to connect to (e.g. my-namespace.servicebus.windows.net). You can find this in the top left-hand corner of your Azure area.

  • Enter the Event Hub Name*. To find it, click your Event Hubs namespace in the Azure console to view the hubs it contains in the middle pane. Alternatively, click Event Hub to create one.

  • Enter the Consumer Group. To see the available names in Azure, scroll down to Entities in the left-hand menu and click Consumer groups. This value is $Default when empty.


5

In the Authentication section, choose between Connection String and Entra ID as the Authentication Type.

  • Connection String

    • Connection String* The URL for your Event Hub. To get it:

      1. Click your Event Hubs namespace to view the Hubs it contains.

      2. Scroll down to the bottom and click the specific event hub to connect to.

      3. In the left menu, go to Shared Access Policies.

      4. If no policy exists for the event hub, create one with Listen access (a Manage policy also includes it). Send access alone is not enough to receive events.

      5. Select the policy from the list.

      6. Select the copy button next to the Connection string-primary key field.

      Depending on the version of Azure you are using, the corresponding field may have a different name, so to help you find it, look for a string with the same format: Endpoint=sb://.servicebus.windows.net/; SharedAccessKeyName=RootManageSharedAccessKey; SharedAccessKey=

  • Entra ID - enter the following credentials from the Certificates & Secrets area

    • Tenant ID*

    • Client ID*

    • Client Secret*
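
Before saving a connection string as a secret, it can help to confirm that it has the format described above. The following is a minimal sketch, not part of Onum: the function name parse_connection_string and the sample values are hypothetical, and the key shown is a placeholder rather than a real credential.

```python
REQUIRED_KEYS = ("Endpoint", "SharedAccessKeyName", "SharedAccessKey")


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dict and check the required keys."""
    parts = {}
    for segment in conn_str.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing ';' or stray spaces
        # Split on the first '=' only, because base64 keys often end with '='.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"Malformed segment: {segment!r}")
        parts[key.strip()] = value.strip()
    missing = [k for k in REQUIRED_KEYS if not parts.get(k)]
    if missing:
        raise ValueError(f"Missing or empty fields: {', '.join(missing)}")
    if not parts["Endpoint"].startswith("sb://"):
        raise ValueError("Endpoint should start with sb://")
    return parts


# Placeholder values for illustration only.
example = (
    "Endpoint=sb://my-namespace.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=abc123=;"
    "EntityPath=my-hub"
)
fields = parse_connection_string(example)
print(fields["Endpoint"])
```

A string copied from a policy on a specific event hub usually also carries an EntityPath entry with the hub name; the sketch keeps it but does not require it.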

Open the Secret fields and click New secret to create a new one:

  • Give the secret a Name.

  • Turn off the Expiration date option.

  • Click Add new value and paste the value you obtained above (for example, the connection string or the client secret).

  • Click Save.

6

You can now select the secret you just created in the corresponding fields.

7

Checkpointing & Processor

When multiple consumer instances read from the same Event Hub and consumer group, a cooperative processor coordinates partition ownership and progress using a checkpoint store (Azure Blob Storage).

  • Supports at-least-once processing and limits reprocessing when instances restart: committed checkpoints allow new owners to resume from the last checkpointed offset instead of re-reading the whole partition. Events processed after the last checkpoint may be delivered again.

  • Evenly distributes partitions across active instances (load balancing): with the balanced strategy, ownership is redistributed as instances join/leave; greedy tries to acquire as many partitions as possible.

  • Enables safe horizontal scaling: adding instances increases throughput by processing multiple partitions in parallel.
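
The effect of checkpoints on a restart can be illustrated with a small simulation. This is only a sketch under simplifying assumptions (a single partition held in a list, a checkpoint committed every few events); process_partition and its parameters are hypothetical names and do not reflect how Onum or the Azure SDK implement checkpointing.

```python
def process_partition(events, checkpoint, checkpoint_every, crash_after=None):
    """Process events after `checkpoint`; return (processed offsets, new checkpoint).

    `checkpoint` is the offset of the last committed event (-1 means none).
    `crash_after` simulates the instance stopping after that many events.
    """
    processed = []
    for offset in range(checkpoint + 1, len(events)):
        if crash_after is not None and len(processed) == crash_after:
            break  # simulated crash: progress since the last checkpoint is lost
        processed.append(offset)
        if len(processed) % checkpoint_every == 0:
            checkpoint = offset  # commit progress to the checkpoint store
    return processed, checkpoint


events = [f"event-{i}" for i in range(10)]

# The first owner checkpoints every 3 events and stops after 5 events.
first_run, saved = process_partition(events, -1, checkpoint_every=3, crash_after=5)
# A new owner resumes after the saved checkpoint rather than from offset 0.
second_run, saved = process_partition(events, saved, checkpoint_every=3)

print(first_run)   # [0, 1, 2, 3, 4]
print(second_run)  # [3, 4, 5, 6, 7, 8, 9]
```

Offsets 3 and 4 appear in both runs: they were processed before the stop but after the last checkpoint, so the new owner reads them again. This is the at-least-once behavior, and downstream processing should tolerate such repeats.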

Learn more in the Azure Event Hubs documentation.

8

Enter the Storage Container Name*, that is, the name of the blob container used to persist checkpoints and ownership. To find it in Azure, scroll down to Resource groups in the left-hand menu, where you will see a list of the storage containers.

9

The Storage Connection String parameter is a secret, so you must add this string in the Secrets area, or select it from the list if you have already done so.

See here for where to find it in the Azure portal.

10

Then, configure the Processor Options.

  • Load Balancing Strategy

    Choose how partitions are distributed across the active instances.

    • Balanced - ownership is redistributed evenly as instances join or leave.

    • Greedy - each instance tries to acquire as many partitions as possible.

  • Update Interval (ms) How often a processor renews partition ownership; defaults to 10000ms if unset.

  • Partition Expiration Duration (ms) Enter a time limit in milliseconds, after which a partition's ownership is considered expired and the partition can be claimed by other instances.
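
The interaction between the strategy and the expiration duration can be sketched as a simulation of a single load-balancing cycle. This is a simplified reading, not Onum's or Azure's implementation: the function names, the (owner, last_renewed_ms) record layout, and the rule that greedy takes a fair share at once while balanced takes one partition per cycle are all assumptions made for illustration.

```python
import math


def claimable(ownership, partitions, now_ms, expiration_ms):
    """Return partitions with no owner or with an expired ownership record."""
    free = []
    for partition in partitions:
        record = ownership.get(partition)  # (owner, last_renewed_ms) or None
        if record is None or now_ms - record[1] >= expiration_ms:
            free.append(partition)
    return free


def partitions_to_claim(strategy, me, ownership, partitions, now_ms, expiration_ms):
    """Pick the partitions that instance `me` claims in one cycle."""
    free = claimable(ownership, partitions, now_ms, expiration_ms)
    live = {p: rec for p, rec in ownership.items() if p not in free}
    owners = {rec[0] for rec in live.values()} | {me}
    mine = sum(1 for rec in live.values() if rec[0] == me)
    fair_share = math.ceil(len(partitions) / len(owners))
    wanted = max(0, fair_share - mine)
    if strategy == "greedy":
        return free[:wanted]          # take the whole share at once
    if strategy == "balanced":
        return free[:min(1, wanted)]  # take at most one partition per cycle
    raise ValueError(f"Unknown strategy: {strategy!r}")


partitions = [0, 1, 2, 3]
# Instance "b" renewed partitions 0 and 1 at t=0 ms; 2 and 3 are unowned.
ownership = {0: ("b", 0), 1: ("b", 0)}

greedy_pick = partitions_to_claim("greedy", "a", ownership, partitions, 10_000, 60_000)
balanced_pick = partitions_to_claim("balanced", "a", ownership, partitions, 10_000, 60_000)
after_expiry = claimable(ownership, partitions, 70_000, 60_000)
print(greedy_pick, balanced_pick, after_expiry)
```

In this sketch, an instance that stops renewing its ownership loses its partitions once the expiration duration has passed, which is why the update interval should be comfortably shorter than the expiration duration.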

11

Decide whether to Use batch settings.

When false, the handler processes events one-by-one using internal defaults (maxBatchSize=1, maxWaitTimeMs=500). When true, batch processing settings apply.

  • Max Batch Size* Enter the maximum bytes for the batch.

  • Max Wait Time* Enter the maximum number of milliseconds to wait before considering the batch complete.
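
How the two batch settings interact can be sketched as follows: a batch is released when it reaches the size limit or when the wait time elapses, whichever comes first. The Batcher class is hypothetical and not Onum's implementation. Because the unit of the size limit may differ, the sketch takes a size_of function: the default counts events, and passing len would count bytes instead.

```python
import time


class Batcher:
    """Collect events until a size limit or a wait-time limit is reached."""

    def __init__(self, max_batch_size, max_wait_ms,
                 size_of=lambda event: 1, clock=time.monotonic):
        self.max_batch_size = max_batch_size
        self.max_wait_ms = max_wait_ms
        self._size_of = size_of
        self._clock = clock  # injectable so the example needs no real waiting
        self._batch, self._size, self._started = [], 0, 0.0

    def add(self, event):
        """Add an event; return a completed batch, or None if still filling."""
        if not self._batch:
            self._started = self._clock()  # the wait starts with the first event
        self._batch.append(event)
        self._size += self._size_of(event)
        return self.poll()

    def poll(self):
        """Return the batch if either limit has been reached, else None."""
        if not self._batch:
            return None
        waited_ms = (self._clock() - self._started) * 1000
        if self._size >= self.max_batch_size or waited_ms >= self.max_wait_ms:
            batch, self._batch, self._size = self._batch, [], 0
            return batch
        return None


now = [0.0]  # fake clock, in seconds
batcher = Batcher(max_batch_size=3, max_wait_ms=500, clock=lambda: now[0])
results = [batcher.add(e) for e in ("a", "b", "c", "d")]
now[0] = 0.6  # 600 ms later, the wait time releases the partial batch
late = batcher.poll()
print(results, late)
```

With max_batch_size=1, every call to add returns a one-event batch, which corresponds to the one-by-one behavior described for the default settings.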

12

Add the Backoff Settings regarding how long to wait before retrying a request after failure.

  • Error Backoff (ms) Enter the number of milliseconds to wait after an error before retrying.

  • Idle Backoff (ms) Enter the number of milliseconds to wait before trying again to send a request.
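
The two backoff values can be pictured as pauses in a polling loop. The sketch below assumes that the error backoff applies after a failed request and that the idle backoff applies after a request that returned nothing; that reading, and the names run_receive_loop, receive and handle, are assumptions rather than a description of Onum's internals.

```python
import time


def run_receive_loop(receive, handle, error_backoff_ms, idle_backoff_ms,
                     max_iterations, sleep=time.sleep):
    """Poll `receive` a fixed number of times, pausing after errors or empty polls."""
    for _ in range(max_iterations):
        try:
            events = receive()
        except Exception:
            sleep(error_backoff_ms / 1000)  # wait after a failure before retrying
            continue
        if not events:
            sleep(idle_backoff_ms / 1000)   # nothing to read: wait before asking again
            continue
        for event in events:
            handle(event)


# Scripted responses: one batch, an empty poll, an error, then another batch.
script = iter([["e1"], [], RuntimeError("boom"), ["e2"]])


def fake_receive():
    item = next(script)
    if isinstance(item, Exception):
        raise item
    return item


handled, pauses = [], []
run_receive_loop(fake_receive, handled.append, error_backoff_ms=2000,
                 idle_backoff_ms=500, max_iterations=4, sleep=pauses.append)
print(handled)  # ['e1', 'e2']
print(pauses)   # [0.5, 2.0]
```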

13

Choose the Decompression method used to restore a compressed message to its original form before it is processed (none, gzip or zlib).
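
To check which option matches your producers, you can reproduce the three cases locally with Python's standard library. This is a minimal sketch; the decompress helper is hypothetical and only mirrors the option names listed above.

```python
import gzip
import zlib


def decompress(payload: bytes, method: str = "none") -> bytes:
    """Restore a message body according to the selected decompression method."""
    if method == "none":
        return payload
    if method == "gzip":
        return gzip.decompress(payload)
    if method == "zlib":
        return zlib.decompress(payload)
    raise ValueError(f"Unsupported decompression method: {method!r}")


original = b'{"msg": "hello"}'
from_gzip = decompress(gzip.compress(original), "gzip")
from_zlib = decompress(zlib.compress(original), "zlib")
print(from_gzip == original, from_zlib == original)
```

The two formats are not interchangeable: choosing gzip for a zlib-compressed payload, or the reverse, raises an error instead of returning data.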

14

Choose the Split Strategy, that is, how each incoming message is divided into individual events, from the following options:

  • None - do not split the message

  • Newline

  • JSON array

  • JSON object

  • Custom Delimiter

    • Custom Delimiter - enter your custom delimiter here.
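
The options above can be illustrated with a short sketch of how one message body might be divided into events. The exact semantics in Onum may differ: here JSON array is taken to mean one event per array element and JSON object to mean one event per concatenated top-level object, and the split_message function and its strategy names are hypothetical.

```python
import json


def split_message(text, strategy="none", delimiter=None):
    """Divide one message body into a list of event strings."""
    if strategy == "none":
        return [text]
    if strategy == "newline":
        return [line for line in text.splitlines() if line]
    if strategy == "json_array":
        # One event per element of a top-level JSON array.
        return [json.dumps(item) for item in json.loads(text)]
    if strategy == "json_object":
        # One event per top-level object in a stream of concatenated objects.
        decoder = json.JSONDecoder()
        events, pos = [], 0
        while pos < len(text):
            if text[pos].isspace():
                pos += 1  # raw_decode does not skip leading whitespace
                continue
            _, end = decoder.raw_decode(text, pos)
            events.append(text[pos:end])
            pos = end
        return events
    if strategy == "custom":
        if not delimiter:
            raise ValueError("A custom delimiter is required")
        return [part for part in text.split(delimiter) if part]
    raise ValueError(f"Unknown split strategy: {strategy!r}")


by_line = split_message("a\nb\n", "newline")
by_array = split_message('[{"a": 1}, {"a": 2}]', "json_array")
by_object = split_message('{"a": 1}{"a": 2} {"a": 3}', "json_object")
by_custom = split_message("x|y|z", "custom", delimiter="|")
print(by_line, by_array, by_object, by_custom)
```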

15

Finally, click Create labels. Optionally, you can set labels to be used for internal Onum routing of data. By default, data will be set as Unlabeled. Click Create listener when you're done.
