# Collect data from your databases

{% hint style="info" %}
See the changelog of the **Relational Databases** Listener [here](/listeners/relational-databases-listener.md).
{% endhint %}

{% hint style="warning" %}
Note that this Listener is only available in certain Tenants. [Get in touch with us](/support/support.md) if you don't see it and want to access it.
{% endhint %}

## Overview

The following article explains how to collect data from your databases using the **Relational Databases** Listener in Onum.

The **Relational Databases** Listener allows you to read data from a database using the **MySQL**, **Oracle**, **Postgres**, **SQL Server** and **SQLite** database management systems. Each row of the data is emitted as a separate event. The data will be stored in the `_raw` field of the events. You can configure the Listener to execute a SQL query periodically.

## Prerequisites

{% hint style="warning" %}
In order to use this Listener, you must activate the following environment variable in your distributor using docker compose: `SINGLETON_LISTENER_EXECUTOR=true`
{% endhint %}

## Onum Setup

{% stepper %}
{% step %}
Log in to your Onum tenant and click **Listeners > New listener**.
{% endstep %}

{% step %}
Double-click the **Relational Databases** Listener.
{% endstep %}

{% step %}
Enter a **Name** for the new Listener. Optionally, add a **Description** and some **Tags** to identify the Listener.
{% endstep %}

{% step %}
In the code box at the top of the **Configuration** section, enter the SQL query to execute. Then, enter the **Database Driver** and the query timeout (in seconds) in the **Query Timeout** field.

<figure><picture><source srcset="/files/w9qs6wr5REIE5uCgXBV1" media="(prefers-color-scheme: dark)"><img src="/files/f0dsIRVvoSlPbCccd8C6" alt=""></picture><figcaption></figcaption></figure>
{% endstep %}

{% step %}
In the **Tracking Column Configuration** section, you can choose to track the last value of a column to be used as pointer for the next query. This is useful if you want to read data in batches.

To do it, choose **true** in the **Use Column Value** parameter. Then, enter the required **Tracking Column** and choose the **Tracking Column Type**.

{% hint style="warning" %}
Before the first query, value is set to `Thursday, 1 January 1970` if your column type is **timestamp**, and `0` in case of **numeric**.
{% endhint %}

If you want to use this feature, you must use the `:sql_last_value` placeholder in your query.

For example, `select * from table where id > :sql_last_value` will generate a query like this: `select * from table where id > (?, $1, :param1..) limit 10000`. The placeholder will be replaced with the value of the column in the first run and in the second run it will be replaced with the value of the column in the second run.

**MySQL example**

`select * from table t where t.created_at > :sql_last_value` will generate a query like this: `select * from table t where t.created_at > ? limit 10000`.

**SQL server and Oracle examples**

* `select * from table t where t.created_at > :sql_last_value OFFSET :offset ROWS` and `t.created_at` is of type timestamp. This will generate a query like this: `select * from table t where t.created_at > ? OFFSET :offset ROWS FETCH NEXT 10000 ROWS ONLY`
* case without pagination: `select * from table t where t.created_at > :sql_last_value` and `t.created_at` is of type timestamp. will generate a query like this: `select * from table t where t.created_at > ? FETCH NEXT 10000 ROWS ONLY`. It will use the last column value as pointer.
* case without pagination and tracking column: `select * from table t`, will generate a query like this: `select * from table t OFFSET :offset ROWS FETCH NEXT 10000 ROWS ONLY`. It will update offset based on query result.
  {% endstep %}

{% step %}
In the **Pagination Params** section, you can configure the batches size. Default values are `10000` for **Limit** and `0` for **Offset**.

If you plan to use an offset in your query, you must use it as placeholder. For example, `select * from table where id > :offset`. In this case, `offset` will be set to `0` for the first run. In the second run, the offset will be set to the number of records returned in the previous run.

You don't need a placeholder for the limit. The Listener will handle this for you based on the database driver.

If you don't provide pagination configuration, the Listener will wrap every query with limit and offset statements. This is done performance reasons and to avoid memory leaks.
{% endstep %}

{% step %}
In the **Scheduler** section, you can schedule the execution of your query. It will generate a cron expression based on the given configuration. If the query execution time overlaps with the scheduled time, tasks will be skipped. It runs in single mode.

Use the **Execute Every** and **Interval Unit** fields to indicate the required execution schedule.

**Examples**

* **Execute Every**: `1` / **Interval Unit**: `seconds` - It will generate the cron expression: `"*/1 * * * * *"`. This will execute the query every second.
* **Execute Every**: `1` / **Interval Unit**: `minutes` - It will generate the cron expression: `"0 */1 * * * * "`. This will execute the query every minute.
* **Execute Every**: `2` / **Interval Unit**: `hours` - It will generate the cron expression: `"0 0 */2  * * *"`. This will execute the query every 2 hours.
  {% endstep %}

{% step %}
In the **Connection Params** section, you can simply provide a **Connection URL** or insert your **Username**, **Password**, **Host**, **Port** and **Database name**.&#x20;

If you want to enter a connection URL, you must add it to onum as a secret. To do it, open the **Connection URL** field and click **New secret**:

* Give the token a **Name**.
* Turn off the **Expiration date** option.
* Click **Add new value** and paste the secret corresponding to the JWT token you generated before. Remember that the token will be added in the Zscaler configuration.
* Click **Save**.

You can now select the URL in the **Connection URL** field.
{% endstep %}

{% step %}
In the **Persistent state** section, enter the path of the file where the last tracking column will be stored. A file with the name `state.json` will be created in the specified path.
{% endstep %}

{% step %}
Finally, click **Create labels**. Optionally, you can set labels to be used for internal Onum routing of data. By default, data will be set as **Unlabeled**. Click **Create listener** when you're done.

{% hint style="info" %}
Learn more about labels in [this article](/the-workspace/listeners/labels.md).
{% endhint %}
{% endstep %}
{% endstepper %}

Click **Create listener** when you're done.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.onum.com/the-workspace/listeners/listener-integrations/collect-data-from-your-databases.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
