# Extract URLs

## Description

This operation extracts URLs from a given input text, supporting HTTP, HTTPS, FTP, and other protocols.

***

## Data types

These are the expected input and output data types for this operation:

### Input data

<img src="/files/5znsr6RzLbBq94l3YqFM" alt="" data-size="line"> - Input events to extract URLs from.

### Output data

<img src="/files/qew1VJKqeIQckZYtmzag" alt="" data-size="line"> - List of comma-separated URLs.

***

## Parameters

These are the parameters you need to configure to use this operation (mandatory parameters are marked with a <mark style="color:red;">**\***</mark>):

<details>

<summary>Protocol Must Be Present</summary>

Choose **true** if you want to extract only URLs that start with a protocol (`http://`, `https://`, etc.). The default value is **false**, which also extracts URLs written without a protocol, such as `reports.example.com`.

</details>

***

## Example

Suppose you want to **extract** all the **URLs** from a given series of events. To do so:

1. In your Pipeline, open the required [Action](/the-workspace/pipelines/actions.md) configuration and select the input **Field**.
2. In the **Operation** field, choose **Extract URLs**.
3. Set **Protocol Must Be Present** to **false**.
4. Give your **Output field** a name and click **Save**.

For example, given this input text:

```
Check the latest sales figures at reports.example.com before tomorrow's call. The customer feedback survey (survey.fictional-company.net) closes Friday. Product documentation has moved to docs.nonexistent-tech.org
```

You'll get the following:

```
reports.example.com,survey.fictional-company.net,docs.nonexistent-tech.org
```
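To reason about this behavior outside the product, the example above can be approximated with a short Python sketch. This is not the operation's actual implementation: the `URL_PATTERN` regular expression, the `extract_urls` function, and its `protocol_must_be_present` argument are hypothetical names chosen to mirror the parameter described on this page, and the pattern is deliberately simplified.

```python
import re

# Hypothetical, simplified pattern (an assumption, not the product's matcher):
# an optional scheme, a dotted hostname ending in an alphabetic TLD,
# and an optional path.
URL_PATTERN = re.compile(
    r"(?P<scheme>[a-zA-Z][a-zA-Z0-9+.-]*://)?"
    r"(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}"
    r"(?:/[^\s,()]*)?"
)


def extract_urls(text: str, protocol_must_be_present: bool = False) -> str:
    """Return the URLs found in `text` as a comma-separated string."""
    found = []
    for match in URL_PATTERN.finditer(text):
        # Skip bare hostnames when a protocol is required.
        if protocol_must_be_present and match.group("scheme") is None:
            continue
        found.append(match.group(0))
    return ",".join(found)


sample = (
    "Check the latest sales figures at reports.example.com before "
    "tomorrow's call. The customer feedback survey "
    "(survey.fictional-company.net) closes Friday. Product documentation "
    "has moved to docs.nonexistent-tech.org"
)

print(extract_urls(sample))
# reports.example.com,survey.fictional-company.net,docs.nonexistent-tech.org
```

In this sketch, calling `extract_urls(sample, protocol_must_be_present=True)` returns an empty string, because none of the URLs in the sample text start with a protocol.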

{% hint style="info" %}
You can try out operations with specific values using the **Input** field above the operation. You can enter the value in the example above and check the result in the **Output** field.
{% endhint %}

