# Rehydrate data from Amazon S3

## Overview

This document describes a method for rehydrating data from Amazon S3 using Falcon Onum.

The process is useful for customers who need to re-ingest data into Falcon NG-SIEM from an S3 source. Many customers reduce or modify the original data with Onum but keep a copy of the original events in S3 for legal requirements.

The process involves creating several artifacts:

* **S3 Bucket** (typically created by the customer)
* **SQS Queue** - This is necessary because the Onum Listener uses SQS queues to detect events in the S3 bucket
* [**Amazon S3** Listener](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/listeners/listener-integrations/collect-data-from-aws-products/collect-data-from-amazon-s3) in Onum
* [**Pipeline**](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/pipelines) in Onum
* [**Falcon NG-SIEM** Data Sink](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/data-sinks/data-sink-integrations/send-data-to-crowdstrike-products/send-data-to-falcon-next-gen-siem) in Onum
* **Data Connector** in Falcon NG-SIEM

## **Limitations**

At the time of writing, the following limitations apply:

* The [**Amazon S3** Listener](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/listeners/listener-integrations/collect-data-from-aws-products/collect-data-from-amazon-s3) in Onum only accepts events stored in CSV or JSON format.
* The [**Amazon S3** Listener](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/listeners/listener-integrations/collect-data-from-aws-products/collect-data-from-amazon-s3) in Onum cannot be used in environments with more than one deployed cluster.

## **Prerequisites**

Before configuring the **Amazon S3** Listener and starting to send data, make sure you meet the following requirements:

* Your AWS user needs at least permission to perform the `GetObject` operation (S3) and the `ReceiveMessage` and `DeleteMessageBatch` operations (SQS) for this Listener to work.
* **Cross-Region Configurations**: Ensure that your S3 bucket and SQS queue are in the same AWS Region, as S3 event notifications do not support cross-region targets.
* **Permissions**: Confirm that the AWS Identity and Access Management (IAM) roles associated with your S3 bucket and SQS queue have the necessary permissions.
* **Object Key Name Filtering**: If you use special characters in your prefix or suffix filters for event notifications, ensure they are URL-encoded.
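
As an illustrative sketch of the permissions listed above, a minimal IAM policy might look like the following. The resource ARNs are placeholders, and note that the `DeleteMessageBatch` operation is authorized by the `sqs:DeleteMessage` IAM action (SQS batch operations use the same action as their non-batch counterparts):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadTaggedObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    },
    {
      "Sid": "AllowConsumeQueue",
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage"
      ],
      "Resource": "arn:aws:sqs:<region>:<account-id>:<queue-name>"
    }
  ]
}
```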

## How it works

The approach used is as follows:

* From the S3 console, the user adds a tag to the objects they want to rehydrate. The tag value is irrelevant; the **Amazon S3** Listener reacts solely to the event generated by the S3 bucket when a tag is added to an object.
* S3 publishes the tagging event to the SQS queue.
* The **Amazon S3** Listener detects the event in the queue and reads the tagged content from the associated S3 bucket.
* Onum processes the events through the Pipeline and sends them to the configured **Falcon NG-SIEM** Data Sink.
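
The steps above can be illustrated with a minimal sketch of the kind of message the Listener consumes from the queue. The field layout follows the standard S3 event notification format; the bucket and object names are made up for the example:

```python
import json

# Example body of an SQS message produced by an S3 "Object tags added"
# event notification (standard S3 event format; names are invented).
message_body = json.dumps({
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectTagging:Put",
            "s3": {
                "bucket": {"name": "my-archive-bucket"},
                "object": {"key": "logs/2024/01/events.json"},
            },
        }
    ]
})

def tagged_objects(body: str):
    """Extract (bucket, key) pairs for objects that received a tag."""
    records = json.loads(body).get("Records", [])
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in records
        if r.get("eventName", "").startswith("ObjectTagging:")
    ]

print(tagged_objects(message_body))
# [('my-archive-bucket', 'logs/2024/01/events.json')]
```

The Listener performs the equivalent of this extraction internally, then fetches each tagged object from the bucket with `GetObject`.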

## **Amazon S3 Setup** <a href="#onumrehydratedatafroms3-amazons3setup" id="onumrehydratedatafroms3-amazons3setup"></a>

You need to configure your Amazon S3 bucket to send notifications to an Amazon Simple Queue Service (SQS) queue when objects are tagged.

### **Create an Amazon SQS Queue**

{% stepper %}
{% step %}
Sign in to the AWS Management Console and open the Amazon SQS console.
{% endstep %}

{% step %}
Choose **Create Queue** and configure the queue settings as needed.
{% endstep %}

{% step %}
After creating the queue, note its Amazon Resource Name (ARN), which follows this format: `arn:aws:sqs:<region>:<account-id>:<queue-name>`.
{% endstep %}
{% endstepper %}
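
The ARN format from the last step can be sketched with a small helper; the region, account ID, and queue name below are placeholders:

```python
def parse_sqs_arn(arn: str):
    """Split an SQS ARN of the form arn:aws:sqs:<region>:<account-id>:<queue-name>."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[:3] != ["arn", "aws", "sqs"]:
        raise ValueError(f"not an SQS ARN: {arn}")
    _, _, _, region, account_id, queue_name = parts
    return region, account_id, queue_name

print(parse_sqs_arn("arn:aws:sqs:us-east-1:123456789012:rehydrate-queue"))
# ('us-east-1', '123456789012', 'rehydrate-queue')
```

You will need this ARN both in the queue access policy below and when pointing S3 event notifications at the queue.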

### **Modify the SQS Queue Policy to Allow S3 to Send Messages**

{% stepper %}
{% step %}
In the Amazon SQS console, select your queue.
{% endstep %}

{% step %}
Navigate to the **Access Policy** tab and choose **Edit**.
{% endstep %}

{% step %}
Replace the existing policy with the following, ensuring you update the placeholders with your specific details:

```json
{
  "Version": "2012-10-17",
  "Id": "S3ToSQSPolicy",
  "Statement": [
    {
      "Sid": "AllowS3Bucket",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "SQS:SendMessage",
      "Resource": "arn:aws:sqs:<region>:<account-id>:<queue-name>",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::<bucket-name>"
        },
        "StringEquals": {
          "aws:SourceAccount": "<account-id>"
        }
      }
    }
  ]
}
```

{% endstep %}

{% step %}
Save the changes. This policy grants your S3 bucket permission to send messages to your SQS queue.
{% endstep %}
{% endstepper %}

### **Configure S3 Event Notifications**

{% stepper %}
{% step %}
Open the Amazon S3 console and select the bucket you want to configure.
{% endstep %}

{% step %}
Go to the **Properties** tab and find the **Event notifications** section.
{% endstep %}

{% step %}
Click on **Create event notification**.
{% endstep %}

{% step %}
Provide a descriptive name for the event notification.
{% endstep %}

{% step %}
In the **Event types** section, select **Object tagging > Object tags added**.
{% endstep %}

{% step %}
In the **Destination** section, choose **SQS Queue** and select the queue you configured earlier.
{% endstep %}

{% step %}
Save the configuration.
{% endstep %}
{% endstepper %}
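
If you prefer to script this step, the console configuration above corresponds to a notification configuration like the following sketch (the queue ARN is a placeholder), which can be applied with `aws s3api put-bucket-notification-configuration`:

```json
{
  "QueueConfigurations": [
    {
      "Id": "rehydrate-on-tag",
      "QueueArn": "arn:aws:sqs:<region>:<account-id>:<queue-name>",
      "Events": ["s3:ObjectTagging:Put"]
    }
  ]
}
```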

## **Falcon NG-SIEM Setup** <a href="#onumrehydratedatafroms3-amazons3setup" id="onumrehydratedatafroms3-amazons3setup"></a>

In Falcon NG-SIEM, configure a Data Connector to receive the events correctly. Note the API URL and API Secret that are generated; you will need them when configuring the Data Sink in Onum.

## **Onum Setup** <a href="#onumrehydratedatafroms3-amazons3setup" id="onumrehydratedatafroms3-amazons3setup"></a>

We will create an **Amazon S3** Listener, a **Falcon NG-SIEM** Data Sink, and a Pipeline.

### Amazon S3 Listener

Follow the steps in [this article](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/listeners/listener-integrations/collect-data-from-aws-products/collect-data-from-amazon-s3).

### Falcon NG-SIEM Data Sink

Follow the steps in [this article](https://app.gitbook.com/s/kxZeV4nlXcIAjMGZxzLI/the-workspace/data-sinks/data-sink-integrations/send-data-to-crowdstrike-products/send-data-to-falcon-next-gen-siem). Use the API URL and API Secret generated in Falcon NG-SIEM to fill the **Instance URL** and **Token** fields.

### Pipeline

In this example, the events from the S3 bucket will not be modified; therefore, the Pipeline simply collects the events from the Listener and sends them in their original format to the Data Sink.

Create a new Pipeline and give it a name. Drag and drop the `All_Data` label from your new Listener and the Data Sink created in the previous step. Connect them, then click the Data Sink to configure the message to be sent.

Click **Publish** and you'll be done.
