Amazon S3
Most recent version: v1.0.0
Amazon S3 is an object storage service that stores and protects any amount of data for a wide range of use cases, including data lakes, websites, cloud-native applications, backups, archives, machine learning, and analytics.
Select Amazon S3 from the list of Data sink types and click Configuration to start.
Now you need to specify how and where to send the data, and how to establish a connection with Amazon S3.
Enter the basic information for the new Data sink.
Name*
Enter a name for the new Data sink.
Description
Optionally, enter a description for the Data sink.
Tags
Add tags to easily identify your Data sink. Hit the Enter key after you define each tag.
Now, add the configuration to establish the connection.
Enter the specific configuration for AWS. You'll find this data in the General purpose buckets area of your Amazon S3 account.
Bucket*
Region*
Choose the region where the cloud server is located. You can find it in your General purpose buckets area, next to the bucket name.
S3 objects are files or data sets that are stored in a bucket. Each object is identified by a key that uses prefixes to simulate a folder structure. Click the bucket name to view its Objects and properties. Click an object to open it and see the following parameters.
Storage class
Canned ACL
Global prefix
Add a static prefix for all the object keys.
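As a minimal sketch of how key prefixes simulate folders, the Python snippet below prepends a static prefix to a few object keys and groups them by their "folder" part. The helper names and the use of "/" to join the Global prefix to the key are assumptions made for illustration, not Onum's documented behavior.

```python
from collections import defaultdict


def with_global_prefix(global_prefix: str, key: str) -> str:
    """Prepend a static prefix to an object key (assumes '/' as the delimiter)."""
    if not global_prefix:
        return key
    return f"{global_prefix.strip('/')}/{key.lstrip('/')}"


def group_by_folder(keys):
    """Group keys by everything before the last '/', as a folder view would."""
    folders = defaultdict(list)
    for key in keys:
        folder, _, name = key.rpartition("/")
        folders[folder].append(name)
    return dict(folders)


# Hypothetical keys, used only to illustrate the grouping.
keys = [
    with_global_prefix("onum", k)
    for k in ("logs/2024/a.log", "logs/2024/b.log", "metrics/c.json")
]
print(group_by_folder(keys))
```

The bucket itself stays flat: the "folders" exist only because several keys share the same leading text.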
Fill in the following credentials only if your bucket requires authorization.
Access key ID*
In the left panel, click on Users.
Select your IAM user.
Under the Security Credentials tab, scroll to Access Keys and you will find existing Access Key IDs (but not the secret access key).
Secret access key*
Under Access keys, you can see your Access Key IDs, but AWS does not show the Secret Access Key. You must have it saved somewhere. If you don't have the secret key saved, you need to create a new one.
Max object size / Input size
Enter the maximum size of each object (in MB) that is sent to the S3 bucket.
Use Max object size if you select Raw as the format in the output configuration.
Use Input size if you select Parquet as the format in the output configuration.
Instead of partitioning by time, you can partition by the size of the message. Assign here the object's maximum size (in MB). If you do not select a Partition by value, a new object is created upon reaching this limit.
For both options, the minimum value is 1, and the maximum value is 5243000. The default value is 100.
Custom endpoint
If you have one, enter your custom endpoint.
If your edge services are deployed on-premises, make sure to check your available disk space. This is because setting an Input size greater than the disk space available may lead to technical issues with workers or infrastructure.
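The size limit described above can be sketched in Python as follows. The limits (1, 5243000, default 100) come from this page; the function names and the rollover logic are illustrative assumptions, not the Data sink's actual implementation.

```python
MIN_MB, MAX_MB, DEFAULT_MB = 1, 5243000, 100


def validate_size_mb(value=None):
    """Return a size in MB within the documented range, or the default if unset."""
    if value is None:
        return DEFAULT_MB
    if not MIN_MB <= value <= MAX_MB:
        raise ValueError(f"size must be between {MIN_MB} and {MAX_MB} MB")
    return value


def split_by_size(events, max_bytes):
    """Yield batches of events; a new batch starts when the limit would be exceeded."""
    batch, size = [], 0
    for event in events:
        if batch and size + len(event) > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(event)
        size += len(event)
    if batch:
        yield batch


# Tiny limit (8 bytes) so the rollover is visible with toy data.
for i, batch in enumerate(split_by_size([b"aaaa", b"bbbb", b"cc"], 8)):
    print(i, batch)
```

Each yielded batch stands for one object sent to the bucket. In a real deployment the limit would be `validate_size_mb(...) * 1024 * 1024` bytes, which is why a large Input size needs matching free disk space on premises.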
Click Finish when complete.
Choose whether the event Format is Raw or Parquet. Depending on the format selected, you'll be prompted to fill in the corresponding parameters:
Event field*
This is the name of the input event field.
Framing method*
This parameter defines how events are separated within an S3 object (further defined in the S3 object section of the Data sink). Choose between the various options:
Newline - Uses a newline character ('\n') to separate individual records in the output.
Length - The S3 framing method length is 10 bytes.
No framing - All events are contained in one line, leading to a long line until the maximum size is reached, with only one region.
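To make the difference between the Newline and No framing options concrete, here is a minimal Python sketch. The function names are hypothetical, and writing a newline after the last record is an assumption. Length framing is not sketched because its header layout is not described here.

```python
def frame_newline(events):
    """Join events so that each record ends with a newline character."""
    return b"".join(event + b"\n" for event in events)


def frame_none(events):
    """Concatenate events with no separator at all."""
    return b"".join(events)


def split_newline(payload):
    """Recover the individual records from newline-framed data."""
    return payload.split(b"\n")[:-1]


events = [b'{"id": 1}', b'{"id": 2}']
print(frame_newline(events))
print(frame_none(events))
```

With newline framing a reader can split the object back into records. With no framing the record boundaries are lost unless the events themselves are self-delimiting.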
Compress data?
Choose between true/false to enable/disable compression.
Choose the format for the name of the objects:
Prefix
The prefix used to organize your S3 data.
Partition by
This indicates how often a new S3 object is generated, e.g. every year, month, day, hour, or minute. If left blank, the Max object size / Input size value entered in the Data sink configuration is used instead.
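A time-based partition can be pictured as a key prefix derived from the event timestamp, so that all events in the same period land under the same name. The sketch below is an illustration only: the key layout, the format table, and the function name are assumptions, since the exact object naming used by the Data sink is not specified here.

```python
from datetime import datetime, timezone

# Hypothetical mapping from a "Partition by" choice to a key layout.
PARTITION_FORMATS = {
    "year": "%Y",
    "month": "%Y/%m",
    "day": "%Y/%m/%d",
    "hour": "%Y/%m/%d/%H",
    "minute": "%Y/%m/%d/%H/%M",
}


def partition_key(prefix: str, partition_by: str, when: datetime) -> str:
    """Build an object key prefix for the period that contains `when`."""
    return f"{prefix.rstrip('/')}/{when.strftime(PARTITION_FORMATS[partition_by])}"


when = datetime(2024, 3, 5, 14, 7, tzinfo=timezone.utc)
print(partition_key("logs", "hour", when))    # logs/2024/03/05/14
print(partition_key("logs", "minute", when))  # logs/2024/03/05/14/07
```

Two events with timestamps in the same hour produce the same key under hourly partitioning, which is what groups them into one object.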
Click Save to save your configuration.
Onum supports integration with .
Decide whether or not to include this Data sink info in the metrics and graphs of the area.
The bucket your data is stored in. This is the bucket Name found in your General purpose buckets area.
The desired storage class. See this in the main objects table, or by clicking the object and going to Storage Class.
Choose the
Add the access key from your or create one. The Access Key ID is found in the IAM Dashboard of the AWS Management Console.
Add the secret access key from your or create one.
When it comes to using this Data sink in a , you must configure the following output parameters. To do so, click the Data sink on the canvas and select Configuration.