Collect data from your databases

See the changelog of the Relational Databases Listener here.

Overview

The following article explains how to collect data from your databases using the Relational Databases Listener in Onum.

The Relational Databases Listener allows you to read data from a database using the MySQL, Oracle, Postgres, SQL Server and SQLite database management systems. Each row of the data is emitted as a separate event. The data will be stored in the _raw field of the events. You can configure the Listener to execute a SQL query periodically.

Onum Setup

1

Log in to your Onum tenant and click Listeners > New listener.

2

Double-click the Relational Databases Listener.

3

Enter a Name for the new Listener. Optionally, add a Description and some Tags to identify the Listener.

4

In the code box at the top of the Configuration section, enter the SQL query to execute. Then, enter the Database Driver and the query timeout (in seconds) in the Query Timeout field.

5

In the Tracking Column Configuration section, you can choose to track the last value of a column to be used as pointer for the next query. This is useful if you want to read data in batches.

To do it, choose true in the Use Column Value parameter. Then, enter the required Tracking Column and choose the Tracking Column Type.

If you want to use this feature, you must use the :sql_last_value placeholder in your query.

For example, select * from table where id > :sql_last_value will generate a query like this: select * from table where id > (?, $1, :param1..) limit 10000. The placeholder will be replaced with the value of the column in the first run and in the second run it will be replaced with the value of the column in the second run.

MySQL example

select * from table t where t.created_at > :sql_last_value will generate a query like this: select * from table t where t.created_at > ? limit 10000.

SQL server and Oracle examples

  • select * from table t where t.created_at > :sql_last_value OFFSET :offset ROWS and t.created_at is of type timestamp. This will generate a query like this: select * from table t where t.created_at > ? OFFSET :offset ROWS FETCH NEXT 10000 ROWS ONLY

  • case without pagination: select * from table t where t.created_at > :sql_last_value and t.created_at is of type timestamp. will generate a query like this: select * from table t where t.created_at > ? FETCH NEXT 10000 ROWS ONLY. It will use the last column value as pointer.

  • case without pagination and tracking column: select * from table t, will generate a query like this: select * from table t OFFSET :offset ROWS FETCH NEXT 10000 ROWS ONLY. It will update offset based on query result.

6

In the Pagination Params section, you can configure the batches size. Default values are 10000 for Limit and 0 for Offset.

If you plan to use an offset in your query, you must use it as placeholder. For example, select * from table where id > :offset. In this case, offset will be set to 0 for the first run. In the second run, the offset will be set to the number of records returned in the previous run.

You don't need a placeholder for the limit. The Listener will handle this for you based on the database driver.

If you don't provide pagination configuration, the Listener will wrap every query with limit and offset statements. This is done performance reasons and to avoid memory leaks.

7

In the Scheduler section, you can schedule the execution of your query. It will generate a cron expression based on the given configuration. If the query execution time overlaps with the scheduled time, tasks will be skipped. It runs in single mode.

Use the Execute Every and Interval Unit fields to indicate the required execution schedule.

Examples

  • Execute Every: 1 / Interval Unit: seconds - It will generate the cron expression: "*/1 * * * * *". This will execute the query every second.

  • Execute Every: 1 / Interval Unit: minutes - It will generate the cron expression: "0 */1 * * * * ". This will execute the query every minute.

  • Execute Every: 2 / Interval Unit: hours - It will generate the cron expression: "0 0 */2 * * *". This will execute the query every 2 hours.

8

In the Connection Params section, you can simply provide a Connection URL or insert your Username, Password, Host, Port and Database name.

If you want to enter a connection URL, you must add it to onum as a secret. To do it, open the Connection URL field and click New secret:

  • Give the token a Name.

  • Turn off the Expiration date option.

  • Click Add new value and paste the secret corresponding to the JWT token you generated before. Remember that the token will be added in the Zscaler configuration.

  • Click Save.

You can now select the URL in the Connection URL field.

9

In the Persistent state section, enter the path of the file where the last tracking column will be stored. A file with the name state.json will be created in the specified path.

10

Finally, click Create labels. Optionally, you can set labels to be used for internal Onum routing of data. By default, data will be set as Unlabeled. Click Create listener when you're done.

Last updated

Was this helpful?