
Amazon S3

Most recent version: v1.0.0




See the changelog of this Data sink type here.

Overview

Amazon S3 is an object storage service that stores and protects any amount of data for a wide range of use cases, including data lakes, websites, cloud-native applications, backups, archives, machine learning, and analytics. Onum supports integration with Amazon S3.

Select Amazon S3 from the list of Data sink types and click Configuration to start.

Data sink configuration

Now you need to specify how and where to send the data, and how to establish a connection with Amazon S3.

Metadata

Enter the basic information for the new Data sink.

| Parameter | Description |
| --- | --- |
| Name* | Enter a name for the new Data sink. |
| Description | Optionally, enter a description for the Data sink. |
| Tags | Add tags to easily identify your Data sink. Hit the Enter key after you define each tag. |


Metrics display

Decide whether or not to include this Data sink info in the metrics and graphs of the Home area.
Configuration

Now, add the configuration to establish the connection.

AWS

Enter the specific configuration for AWS. You'll find this data in the General purpose buckets area of your Amazon S3 account.

| Parameter | Description |
| --- | --- |
| Bucket* | The AWS bucket your data is stored in. This is the bucket Name found in your General purpose buckets area. |
| Region* | Choose the region the cloud server is located in, also found in your General purpose buckets area, next to the bucket name. |

S3 object

S3 objects are files or data sets that are stored in a bucket. Each object is identified by a key that uses prefixes to simulate a folder structure. Click the bucket name to view its Objects and properties. Click an object to open it and see the following parameters.

| Parameter | Description |
| --- | --- |
| Storage class | The desired S3 storage class. See this in the main objects table, or by clicking the object and going to Storage Class. |
| Canned ACL | Choose the S3 Access Control List (ACL) to apply to the objects. |
| Global prefix | Add a static prefix for all the object keys. |
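S3 has no real folders: a key is a flat string, and the `/` separators in it are what the console and most tools render as a folder hierarchy. The following minimal sketch (a hypothetical helper, not part of Onum) shows how a global prefix and path segments combine into an object key:

```python
def build_object_key(prefix: str, *parts: str) -> str:
    """Join a global prefix and path segments into an S3 object key.

    The '/' separators are purely conventional; S3 stores the key as
    one flat string, but consoles display it as nested folders.
    """
    segments = [prefix.strip("/")] + [p.strip("/") for p in parts]
    return "/".join(s for s in segments if s)

# Listed under the logs/2024/06/ "folder" in the S3 console:
key = build_object_key("logs/", "2024", "06", "events.json")
```

A global prefix set in this Data sink plays the role of the first segment, so all objects written by the sink share one top-level "folder".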

Auth

Complete this section only if your bucket requires authorization.

| Parameter | Description |
| --- | --- |
| Access key ID* | Add the access key ID from your Secrets, or create one. You'll find existing Access Key IDs in the IAM Dashboard of the AWS Management Console: in the left panel, click Users, select your IAM user, and scroll to Access Keys under the Security Credentials tab (the secret access key is not shown there). |
| Secret access key* | Add the secret access key from your Secrets, or create one. AWS does not display the Secret Access Key after creation, so you must have it saved somewhere. If you don't have it saved, you need to create a new one. |

Advanced options

| Parameter | Description |
| --- | --- |
| Max object size / Input size | Enter the maximum size of each object (in MB) sent to the S3 bucket. Use Max object size if you select Raw as the format in the output configuration, or Input size if you select Parquet. Instead of partitioning by time, you can partition by message size: if you do not select a Partition by value, a new object is created upon reaching this limit. For both options, the minimum value is 1, the maximum value is 5243000, and the default value is 100. |
| Custom endpoint | If you have one, enter your custom endpoint. |

If your edge services are deployed on-premises, make sure to check your available disk space. This is because setting an Input size greater than the disk space available may lead to technical issues with workers or infrastructure.
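Size-based partitioning amounts to buffering events and starting a new object whenever the configured limit would be exceeded. The sketch below illustrates that rollover logic under stated assumptions (a hypothetical in-memory buffer, not Onum's internal implementation):

```python
class SizeRollover:
    """Illustrative size-based rollover: buffer events and cut a new
    S3 object whenever the configured maximum size (in MB) is reached.
    Hypothetical helper, not Onum's actual sink code."""

    def __init__(self, max_object_mb: int = 100):
        self.limit = max_object_mb * 1024 * 1024  # MB -> bytes
        self.buffer: list[bytes] = []
        self.size = 0
        self.flushed: list[bytes] = []  # stands in for uploaded objects

    def add(self, event: bytes) -> None:
        # adding this event would exceed the limit: close the current object
        if self.size + len(event) > self.limit and self.buffer:
            self.flush()
        self.buffer.append(event)
        self.size += len(event)

    def flush(self) -> None:
        # in the real sink, this is the point where an object is uploaded
        self.flushed.append(b"".join(self.buffer))
        self.buffer, self.size = [], 0
```

This is also why the on-premises disk-space warning above matters: an object's worth of data has to be staged somewhere until it reaches the configured size.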

Click Finish when complete.

Pipeline configuration

When it comes to using this Data sink in a Pipeline, you must configure the following output parameters. To do so, simply click the Data sink on the canvas and select Configuration.

Output configuration

Format

Choose whether the event Format is Raw or Parquet. Depending on the format selected, you'll be prompted to fill in the corresponding parameters:

**Raw**

| Parameter | Description |
| --- | --- |
| Event field* | The name of the input event field. |
| Framing method* | Defines how events are separated within an S3 object (further defined in the S3 object section of the Data sink). Choose between Newline (a newline character `\n` separates individual records in the output), Length (records are framed by a length field; the S3 framing method length is 10 bytes), and No framing (all events are concatenated into a single line that grows until the maximum size is reached). |
| Compress data? | Choose between true/false to enable/disable compression. |
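The three framing methods can be illustrated with a short sketch. This is an approximation of the behavior described above, not Onum's actual code; in particular, reading the 10-byte "framing method length" as a fixed-width decimal length field preceding each record is an assumption:

```python
def frame_events(events: list[bytes], method: str) -> bytes:
    """Illustrative sketch of how framing separates events in one S3 object."""
    if method == "newline":
        # one record per line, separated by '\n'
        return b"\n".join(events) + b"\n"
    if method == "length":
        # assumption: each record preceded by a fixed 10-byte length field
        return b"".join(b"%010d" % len(e) + e for e in events)
    if method == "no_framing":
        # everything concatenated into a single unbroken line
        return b"".join(events)
    raise ValueError(f"unknown framing method: {method}")
```

Whatever writes the objects and whatever reads them back must agree on the framing method, since the object itself carries no other record boundaries.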

**Parquet**

| Parameter | Description |
| --- | --- |
| Event fields* | Separate the raw event into fields. Give each field a name and add as many fields as required by clicking Add element. |

Key format

Choose the format for the name of the objects:

| Parameter | Description |
| --- | --- |
| Prefix | The prefix used to organize your S3 data. |
| Partition by | The frequency with which to generate a new S3 object, e.g. every year, month, day, hour, or minute. If left blank, the Max object size / Input size entered in the Data sink configuration is used instead. |
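A time-based Partition by value effectively maps each event's timestamp to a key prefix, so a new object is started whenever the period rolls over. The sketch below shows one plausible mapping (illustrative naming only; Onum's exact key scheme may differ):

```python
from datetime import datetime, timezone

def partitioned_key(prefix: str, ts: datetime, partition_by: str) -> str:
    """Map a timestamp to a time-partitioned S3 key prefix.

    Hypothetical layout: each finer granularity nests inside the
    coarser one, e.g. 'day' yields prefix/YYYY/MM/DD/.
    """
    formats = {
        "year": "%Y",
        "month": "%Y/%m",
        "day": "%Y/%m/%d",
        "hour": "%Y/%m/%d/%H",
        "minute": "%Y/%m/%d/%H/%M",
    }
    return f"{prefix}/{ts.strftime(formats[partition_by])}/"

ts = datetime(2024, 6, 1, 12, 30, tzinfo=timezone.utc)
# partitioned_key("logs", ts, "day") -> "logs/2024/06/01/"
```

Finer granularities produce more, smaller objects; coarser ones produce fewer, larger objects that may instead hit the size limit first.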

Click Save to save your configuration.
