Google DLP

Most recent version: v0.0.1

See the changelog of this Action type.

Overview

The Google DLP Action integrates with Google's Data Loss Prevention (DLP) API. It detects and classifies sensitive information in your events, enabling workflows to comply with data protection requirements.

This Action does not generate new events. Instead, it processes incoming events to detect sensitive information based on the configured Info Types and returns the corresponding findings.
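
Conceptually, the Action sends the inspected field of each event to the DLP API's content-inspection method and attaches the findings it gets back. The following is a rough illustration using Google's Python client library, not Onum's internal implementation; the project ID and sample text are placeholders:

from google.cloud import dlp_v2

client = dlp_v2.DlpServiceClient()
response = client.inspect_content(
    request={
        "parent": "projects/my-project",  # placeholder project ID
        "inspect_config": {"info_types": [{"name": "EMAIL_ADDRESS"}]},
        "item": {"value": "Contact me at jane.doe@example.com"},
    }
)
for finding in response.result.findings:
    # Each finding reports which Info Type matched and how confident the match is.
    print(finding.info_type.name, finding.likelihood)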

Ports

These are the input and output ports of this Action:

Input ports
  • Default port - All the events to be processed by this Action enter through this port.

Output ports
  • Error port - Events are sent through this port if an error occurs while processing them.

Configuration

1. Find Google DLP in the Actions tab (under the Advanced group) and drag it onto the canvas.

2. To open the configuration, click the Action in the canvas and select Configuration.

3. Enter the required parameters:

Parameters marked with an asterisk (*) are required.

  • Info Types*: Type(s) of sensitive data to detect. You can choose as many types as needed.
  • Data to Inspect*: The input field that contains the data to be inspected by the DLP API.
  • JSON credentials*: JSON object containing the credentials required to authenticate with the Google DLP API.
  • Output Field*: Name of the new field where the results of the DLP evaluation will be stored.
  • Minimum Likelihood: For each potential finding detected during the scan, the DLP API assigns a likelihood level, which describes how likely it is that the finding matches the Info Type you're scanning for. For example, it might assign a likelihood of Likely to a finding that looks like an email address. The API filters out any findings with a likelihood lower than the minimum you set here. The available values are Very Unlikely, Unlikely, Possible (the default), Likely, and Very Likely. For example, if you set the minimum likelihood to Possible, you get only the findings evaluated as Possible, Likely, or Very Likely; if you set it to Very Likely, you get the smallest number of findings.
  • Include Quote: If true, includes a contextual quote from the data that triggered a finding. The default value is true.
  • Exclude Info Types: If true, excludes type information from the findings. The default value is false.

4. Click Save to complete the process.
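
To make these parameters concrete, here is a hedged sketch of how they map onto a DLP inspect request, using Google's Python client library. This is an illustration only, not Onum's implementation; the credential values and inspected text are placeholders:

from google.cloud import dlp_v2

# Placeholder for the "JSON credentials" parameter (a service-account key).
credentials_info = {
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "dlp-reader@my-project.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
    # ...plus the remaining fields from your downloaded key file
}
client = dlp_v2.DlpServiceClient.from_service_account_info(credentials_info)

inspect_config = {
    "info_types": [{"name": "CREDIT_CARD_NUMBER"}],  # Info Types
    "min_likelihood": dlp_v2.Likelihood.POSSIBLE,    # Minimum Likelihood
    "include_quote": True,                           # Include Quote
    "exclude_info_types": False,                     # Exclude Info Types
}

response = client.inspect_content(
    request={
        "parent": f"projects/{credentials_info['project_id']}",
        "inspect_config": inspect_config,
        # The "Data to Inspect" field supplies the item value.
        "item": {"value": "text taken from the inspected event field"},
    }
)

The findings in response.result.findings correspond to what the Action stores in the field named by Output Field.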

Example

Imagine you want to ensure that logs sent to a third-party service do not contain sensitive information such as credit card numbers, personal identification numbers, or passwords. To do this:

1. Add the Google DLP Action to your Pipeline and link it to your required Data sink.

2. Double-click the Google DLP Action to configure it and set the following parameters:

  • Info Types: Choose the following info types:
    • Credit Card Number
    • Email Address
    • Password
  • Data to Inspect: Choose the input field that contains the data to be inspected by the DLP API.
  • JSON credentials: Provide the JSON object containing the credentials required to authenticate with the Google DLP API.
  • Output Field: Enter a name for the new field where the results of the DLP evaluation will be stored.
  • Minimum Likelihood: Set this to Possible, as we want the right balance between recall and precision.
  • Include Quote: We want contextual info for the findings, so set this to true.
  • Exclude Info Types: Set this to false, as we want to keep the type information of the findings.

3. Click Save to apply the configuration.

4. Link the Default output port of the Action to the input port of your Data sink.

5. Click Publish and choose the clusters you want to publish the Pipeline to.

6. Click Test pipeline at the top of the area and choose a specific number of events to test whether your data is transformed properly. Click Debug to proceed.

This is the input data field we chose for our analysis:

{
  "Info": "My credit card number is 4111-1111-1111-1111"
}

And this is sample output data with the corresponding results from the DLP API:

{
  "dlpFindings": {
    "findings": [
      {
        "infoType": "CREDIT_CARD_NUMBER",
        "likelihood": "VERY_LIKELY",
        "quote": "4111-1111-1111-1111"
      }
    ]
  }
}
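
For reference, the transformation above can be approximated end to end with the DLP Python client. The field names Info and dlpFindings come from this example; the project ID is a placeholder:

import json
from google.cloud import dlp_v2

event = {"Info": "My credit card number is 4111-1111-1111-1111"}

client = dlp_v2.DlpServiceClient()
response = client.inspect_content(
    request={
        "parent": "projects/my-project",  # placeholder project ID
        "inspect_config": {
            "info_types": [{"name": "CREDIT_CARD_NUMBER"}],
            "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
            "include_quote": True,
        },
        "item": {"value": event["Info"]},
    }
)

# Shape the findings like the Output Field shown above.
event["dlpFindings"] = {
    "findings": [
        {
            "infoType": f.info_type.name,
            "likelihood": f.likelihood.name,
            "quote": f.quote,
        }
        for f in response.result.findings
    ]
}
print(json.dumps(event, indent=2))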

In order to configure this Action, you must first link it to a Listener. Go to Building a Pipeline to learn how this works.