Collect data from your databases

See the changelog of the Relational Databases Listener here.

Note that this Listener is only available in certain Tenants. Get in touch with us if you don't see it and want to access it.

Overview

The following article explains how to collect data from your databases using the Relational Databases Listener in Onum.

The Relational Databases Listener allows you to read data from a database using the MySQL, Oracle, Postgres, SQL Server and SQLite database management systems. Each row of the data is emitted as a separate event. The data will be stored in the _raw field of the events. You can configure the Listener to execute a SQL query periodically.

Onum Setup

Double-click the Relational Databases Listener.

Enter a Name for the new Listener. Optionally, add a Description and some Tags to identify the Listener.

In the code box at the top of the Configuration section, enter the SQL query to execute. Then, enter the Database Driver and the query timeout (in seconds) in the Query Timeout field.

In the Tracking Column Configuration section, you can choose to track the last value of a column to be used as pointer for the next query. This is useful if you want to read data in batches.

To do it, choose true in the Use Column Value parameter. Then, enter the required Tracking Column and choose the Tracking Column Type.

Before the first query, value is set to Thursday, 1 January 1970 if your column type is timestamp, and 0 in case of numeric.

If you want to use this feature, you must use the :sql_last_value placeholder in your query.

For example, select * from table where id > :sql_last_value will generate a query like this: select * from table where id > (?, $1, :param1..) limit 10000. The placeholder will be replaced with the value of the column in the first run and in the second run it will be replaced with the value of the column in the second run.

MySQL example

select * from table t where t.created_at > :sql_last_value will generate a query like this: select * from table t where t.created_at > ? limit 10000.

SQL server and Oracle examples

select * from table t where t.created_at > :sql_last_value OFFSET :offset ROWS and t.created_at is of type timestamp. This will generate a query like this: select * from table t where t.created_at > ? OFFSET :offset ROWS FETCH NEXT 10000 ROWS ONLY
case without pagination: select * from table t where t.created_at > :sql_last_value and t.created_at is of type timestamp. will generate a query like this: select * from table t where t.created_at > ? FETCH NEXT 10000 ROWS ONLY. It will use the last column value as pointer.
case without pagination and tracking column: select * from table t, will generate a query like this: select * from table t OFFSET :offset ROWS FETCH NEXT 10000 ROWS ONLY. It will update offset based on query result.

In the Pagination Params section, you can configure the batches size. Default values are 10000 for Limit and 0 for Offset.

If you plan to use an offset in your query, you must use it as placeholder. For example, select * from table where id > :offset. In this case, offset will be set to 0 for the first run. In the second run, the offset will be set to the number of records returned in the previous run.

You don't need a placeholder for the limit. The Listener will handle this for you based on the database driver.

If you don't provide pagination configuration, the Listener will wrap every query with limit and offset statements. This is done performance reasons and to avoid memory leaks.

In the Scheduler section, you can schedule the execution of your query. It will generate a cron expression based on the given configuration. If the query execution time overlaps with the scheduled time, tasks will be skipped. It runs in single mode.

Use the Execute Every and Interval Unit fields to indicate the required execution schedule.

Examples

Execute Every: 1 / Interval Unit: seconds - It will generate the cron expression: "*/1 * * * * *". This will execute the query every second.
Execute Every: 1 / Interval Unit: minutes - It will generate the cron expression: "0 */1 * * * * ". This will execute the query every minute.
Execute Every: 2 / Interval Unit: hours - It will generate the cron expression: "0 0 */2 * * *". This will execute the query every 2 hours.

In the Connection Params section, you can simply provide a Connection URL or insert your Username, Password, Host, Port and Database name.

If you want to enter a connection URL, you must add it to onum as a secret. To do it, open the Connection URL field and click New secret:

Give the token a Name.
Turn off the Expiration date option.
Click Add new value and paste the secret corresponding to the JWT token you generated before. Remember that the token will be added in the Zscaler configuration.
Click Save.

You can now select the URL in the Connection URL field.

In the Persistent state section, enter the path of the file where the last tracking column will be stored. A file with the name state.json will be created in the specified path.

Finally, click Create labels. Optionally, you can set labels to be used for internal Onum routing of data. By default, data will be set as Unlabeled. Click Create listener when you're done.

Learn more about labels in this article.

PreviousCollect data from Apache Kafka NextCollect data using HTTP

Last updated 14 hours ago

Was this helpful?