Google DLP
Most recent version: v0.0.1
See the changelog of this Action type.
The Google DLP Action is designed to integrate with Google's Data Loss Prevention (DLP) API. This Action allows detecting and classifying sensitive information, enabling workflows to comply with data protection requirements.
This Action does not generate new events. Instead, it processes incoming events to detect sensitive information based on the configured Info Types and returns the corresponding findings.
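The behavior described above (the same event goes in and comes out, with findings attached) can be sketched as follows. This is only an illustration: `stand_in_inspect` is a hypothetical regex-based substitute for the real DLP API call, and the field name `dlp_findings` and the shape of each finding are assumptions, not the Action's actual output format.

```python
import re

# Hypothetical stand-in detector: a simple email pattern, NOT the DLP API.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def stand_in_inspect(text):
    """Return a list of illustrative findings for email addresses in text."""
    return [
        {"info_type": "EMAIL_ADDRESS", "quote": match.group(0)}
        for match in EMAIL_RE.finditer(text)
    ]


def enrich_event(event, data_field, output_field):
    """Copy the event and store the findings in a new output field.

    No new event is created; the original fields are preserved.
    """
    enriched = dict(event)
    enriched[output_field] = stand_in_inspect(str(event.get(data_field, "")))
    return enriched


# Hypothetical event, for illustration only.
event = {"message": "contact alice@example.com for access"}
result = enrich_event(event, "message", "dlp_findings")
```

Here `data_field` plays the role of the Data to Inspect parameter and `output_field` the role of Output Field.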
To configure this Action, you must first link it to a Listener. Go to Building a Pipeline to learn how this works.
These are the input and output ports of this Action:
Find Google DLP in the Actions tab (under the Advanced group) and drag it onto the canvas.
To open the configuration, click the Action in the canvas and select Configuration.
Enter the required parameters:
Info Types*
Type(s) of sensitive data to detect. You can choose as many types as needed.
Data to Inspect*
Choose the input field that contains the data to be inspected by the DLP API.
JSON credentials*
JSON object containing the credentials required to authenticate with the Google DLP API.
Output Field*
Name of the new field where the results of the DLP evaluation will be stored.
Minimum Likelihood
For each potential finding detected during the scan, the DLP API assigns a likelihood level. The likelihood level describes how likely it is that the finding matches an Info Type that you're scanning for. For example, it might assign a likelihood of Likely to a finding that looks like an email address.
The API will filter out any findings that have a lower likelihood than the minimum level that you set here.
The available values are:
Very Unlikely
Unlikely
Possible (default)
Likely
Very Likely
For example, if you set the minimum likelihood to Possible, you get only the findings that were evaluated as Possible, Likely, and Very Likely. If you set the minimum likelihood to Very Likely, you get the smallest number of findings.
Include Quote
If true, includes a contextual quote from the data that triggered a finding. The default value is true.
Exclude Info Types
If true, excludes the Info Type information from the findings. The default value is false.
Click Save to complete the process.
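The effect of the Minimum Likelihood parameter can be sketched as a simple threshold over ordered levels. This is a minimal illustration, not the API's implementation: the filtering happens inside the DLP API, and the `sample` findings and the `filter_findings` helper below are hypothetical.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    # Ordered from least to most certain, matching the values listed above.
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


def filter_findings(findings, minimum=Likelihood.POSSIBLE):
    """Keep only findings at or above the minimum likelihood."""
    return [f for f in findings if f["likelihood"] >= minimum]


# Hypothetical findings, for illustration only.
sample = [
    {"info_type": "EMAIL_ADDRESS", "likelihood": Likelihood.LIKELY},
    {"info_type": "CREDIT_CARD_NUMBER", "likelihood": Likelihood.UNLIKELY},
    {"info_type": "PASSWORD", "likelihood": Likelihood.VERY_LIKELY},
]

kept_default = filter_findings(sample)                         # Possible and above
kept_strict = filter_findings(sample, Likelihood.VERY_LIKELY)  # fewest findings
```

With the default threshold, the Unlikely finding is dropped; raising the threshold to Very Likely leaves only the most certain finding.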
Imagine you want to ensure that logs sent to a third-party service do not contain sensitive information such as credit card numbers, personal identification numbers, or passwords. To do so:
Add the Google DLP Action to your Pipeline and link it to your required Data sink.
Now, double-click the Google DLP Action to configure it. Set the following parameters:
Info Types
Choose the following info types:
Credit Card Number
Email Address
Password
Data to Inspect
Choose the input field that contains the data to be inspected by the DLP API.
JSON credentials
JSON object containing the credentials required to authenticate with the Google DLP API.
Output Field
Name of the new field where the results of the DLP evaluation will be stored.
Minimum Likelihood
We set the likelihood to Possible, as we want the right balance between recall and precision.
Include Quote
We want contextual info of the findings, so we set this to true.
Exclude Info Types
Set this to false, as we want to include type information of the findings.
Click Save to apply the configuration.
Now link the Default output port of the Action to the input port of your Data sink.
Finally, click Publish and choose in which clusters you want to publish the Pipeline.
Click Test pipeline at the top of the area and choose a specific number of events to test if your data is transformed properly. Click Debug to proceed.
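For reference, the parameter choices in this example correspond roughly to the following inspection settings, written here as a plain Python dictionary. The field names follow the `InspectConfig` message of the Google DLP API as used by its Python client; how the Action builds its request internally is not documented here, so treat the mapping and the `build_inspect_config` helper as assumptions.

```python
import json


def build_inspect_config(info_types, min_likelihood="POSSIBLE",
                         include_quote=True, exclude_info_types=False):
    """Assemble an InspectConfig-style dictionary (hypothetical helper)."""
    return {
        "info_types": [{"name": name} for name in info_types],
        "min_likelihood": min_likelihood,
        "include_quote": include_quote,
        # False keeps type information in the findings, which is the goal here.
        "exclude_info_types": exclude_info_types,
    }


example_config = build_inspect_config(
    ["CREDIT_CARD_NUMBER", "EMAIL_ADDRESS", "PASSWORD"],
    min_likelihood="POSSIBLE",
    include_quote=True,
    exclude_info_types=False,
)

print(json.dumps(example_config, indent=2))
```

The dictionary contains no credentials or input data; those are supplied separately through the JSON credentials and Data to Inspect parameters.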
This is the input data field we chose for our analysis:
And this is a sample output data with the corresponding results of the DLP API: