PCL (Parser Configuration Language)
Introduction
PCL is a language designed to extract data from a line of text by describing its structure. The language aims to use a concise and intuitive syntax to help visualize the structure of a line of text.
PCL expressions are used to configure the Parser Action.
Syntax basics
A valid PCL expression is composed of one or more fields, with a delimiter between each pair of consecutive fields. That is, it must follow this rule:
delimiter? fixedLength* field(delimiter fixedLength* field)* delimiter?
Where a delimiter can be a literal or an operator. An operator can optionally have surrounding literals.
When using groups, the PCL behaviour can change, as groups are a special type of field that can be written next to other fields without a delimiter. Check the Group section below to learn more about this.
At the moment, the only possible fixed-length field is a string.
Valid example
{myFieldOne:string} {myFieldTwo:int}<while(value=" ")>{myCsv:csv(indices=[0,2],separator=",")}
Invalid example (no delimiters)
{myFieldOne:string}{myFieldTwo:int}
The grammar supports any field name written with the characters A-Z, a-z, 0-9 and the underscore (_). It also supports field aliases with any name written with the characters A-Z, a-z, 0-9, _, -, # and . (given that the first character is not _).
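The naming rules above can be sketched as two regular expressions. This is only an illustration, not part of PCL itself: the function names are hypothetical, and it assumes that the restriction on the leading underscore applies to aliases only.

```python
import re

# Assumption: field names use A-Z, a-z, 0-9 and the underscore only.
FIELD_NAME = re.compile(r"[A-Za-z0-9_]+")
# Assumption: aliases also allow '-', '#' and '.', and must not start with '_'.
ALIAS = re.compile(r"(?!_)[A-Za-z0-9_\-#.]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return FIELD_NAME.fullmatch(name) is not None


def is_valid_alias(alias: str) -> bool:
    """Return True if the alias uses allowed characters and does not start with '_'."""
    return ALIAS.fullmatch(alias) is not None


print(is_valid_alias("my-new-name"))  # True
print(is_valid_alias("_myNewName"))   # False
```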
Syntax fields
In PCL, we can write any sequence of fields. The type of fields can be the following: CSV, Float, Group, Integer, JSON, Key-value list, String and XML.
Learn more about each field option in the Field options section below.
CSV
CSV is a configurable field. The available parameters are:
alias
Rename the field name.
indices
Select which columns you want to extract from the CSV.
separator
Define the separator of the columns.
totalColumns
Indicate the number of columns of your CSV.
Note that the totalColumns parameter is mandatory when there is a delimiter after the field that is equal to the CSV separator. For example, a CSV with 3 columns followed by a JSON, separated by a comma:
1,2,3,{"hello":"world"}
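To see why the column count matters here, the following sketch splits the line at the separator a fixed number of times, so that whatever follows the last CSV column is left untouched for the next field. The helper name is hypothetical, and the sketch ignores quoted columns; it does not claim to reproduce how the parser works internally.

```python
def split_csv_prefix(line: str, separator: str, total_columns: int):
    """Split off the first total_columns columns and return them with the remainder."""
    parts = line.split(separator, total_columns)
    columns = parts[:total_columns]
    rest = parts[total_columns] if len(parts) > total_columns else ""
    return columns, rest


columns, rest = split_csv_prefix('1,2,3,{"hello":"world"}', ",", 3)
print(columns)  # ['1', '2', '3']
print(rest)     # {"hello":"world"}
```

Without the column count, there would be no way to tell whether the fourth comma-separated item belongs to the CSV or to the field after it.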
Examples
{myFieldName:csv(indices=[0,1,3],totalColumns=4,separator=",")}
{myFieldName:csv(indices=[0,1,3],totalColumns=4,separator=",", alias="newCsvName")}
{myFieldName:csv(indices=[0:string(alias="csvFieldName1"), 1:string(alias="csvFieldName2")], alias="newCsvNames")}
Float
Float is a configurable field. The available parameters are:
alias
Rename the field name.
decimalSeparator
Define the decimal separator.
thousandSeparator
Define the thousands separator.
Note that the decimalSeparator and thousandSeparator parameters cannot contain the same value.
Example
{myField:float(decimalSeparator=".")}
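As a rough illustration of how the two separators interact, the sketch below normalises a number written with custom separators before converting it. The function and its argument names are hypothetical; the defaults mirror the ones described in the Field options section.

```python
def parse_float(text: str, decimal_separator: str = ".", thousand_separator: str = "") -> float:
    """Convert text to a float, honouring custom decimal and thousands separators."""
    if decimal_separator == thousand_separator:
        # Mirrors the rule that both parameters cannot contain the same value.
        raise ValueError("decimalSeparator and thousandSeparator must differ")
    if thousand_separator:
        text = text.replace(thousand_separator, "")
    return float(text.replace(decimal_separator, "."))


print(parse_float("1.234,56", decimal_separator=",", thousand_separator="."))  # 1234.56
```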
Group
A Group is a special type that might contain two or more of the following simple types:
Float
Integer
Separator
String
Groups cannot be used inside other groups. The available parameters are:
optional
When this option is set, the content of the group may or may not be present in the log to parse.
Optionally, groups can have their type defined.
Examples
{myGroupName:{{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}}}
{myGroupName:{_{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_}}
{myGroupName:group{_{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_}}
{myGroupName:group(optional=true){_{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_}}
The concatenation of simple types inside a group behaves in the same way as a normal PCL, always taking into account the restrictions of which types can be used.
For a PCL containing a group field to be considered valid, the concatenation of fields/delimiters surrounding the group and the inner content of the group must form a valid PCL. Therefore, when unwrapping the inner PCL and joining it with the outside PCL, it must be valid. This means groups don't need to be separated from other fields by delimiters, as long as the resulting PCL is valid. This is because delimiters can be found at the start or end of a group.
For example, given the following valid PCL:
{stringField:string}{myGroupName:group{_{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_}}
It is valid because the concatenation results in the following:
{stringField:string}_{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_
However, the following PCL would be considered invalid:
{stringField:string}{myGroupName:group{{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_}}
This is because the concatenation places two fields next to each other:
{stringField:string}{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_
This also applies when using the optional parameter. Every PCL that results from including or omitting each optional group must be valid. A valid example would be:
{stringField:string}{myGroupName:group(optional=true){_{myFieldOne:string}}}{myGroupName:group(optional=true){_{myFieldOne:string}}}
All the possible PCLs have their fields correctly separated by delimiters. On the other hand, if we had a PCL like the following, it would be considered invalid:
{stringField:string}{myGroupName:group(optional=true){_{myFieldOne:string}}}{myGroupName:group(optional=true){{myFieldOne:string}}}
As one of the possible PCLs would be:
{stringField:string}_{myFieldOne:string}{myFieldOne:string}
There are two fields together, which is not considered valid.
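The adjacency rule can be sketched as a small check over an already unwrapped PCL, that is, one where the groups have been replaced by their inner content as in the examples above. This is a simplified illustration under several assumptions: the function names are hypothetical, escaped characters such as \< are not handled, and braces inside quoted parameter values are not supported.

```python
import re

# A token is a field ({...}), an operator (<...>) or a run of literal characters.
TOKEN = re.compile(r"\{[^{}]*\}|<[^<>]*>|[^{}<>]+")


def is_fixed_length(field: str) -> bool:
    """A string field with a length parameter may be followed directly by another field."""
    return re.search(r":string\([^)]*\blength\s*=", field) is not None


def fields_are_separated(flat_pcl: str) -> bool:
    """Return False if two fields are adjacent and the first one is not fixed-length."""
    previous = None
    for token in TOKEN.findall(flat_pcl):
        both_fields = token.startswith("{") and previous is not None and previous.startswith("{")
        if both_fields and not is_fixed_length(previous):
            return False
        previous = token
    return True


print(fields_are_separated("{stringField:string}_{myFieldOne:string} {myFieldTwo:int} {myFieldThree:int}_"))  # True
print(fields_are_separated("{stringField:string}_{myFieldOne:string}{myFieldOne:string}"))  # False
```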
Integer
Integer is a configurable field. The available parameters are:
alias
Rename the field name.
thousandSeparator
Define the thousands separator.
Example
{myFieldName:int(thousandSeparator=",")}
JSON
JSON is a configurable field. The available parameters are:
alias
Rename the field name.
fields
Select which items you want to extract from the JSON.
Examples
{myFieldName:json(fields=["itemOne","itemTwo"])}
{myfield:json(fields=["hello ":string(alias="hello_"), "bye ":string(alias="bye_")])}
Key-value list
Key-value list is a configurable field. The available parameters are:
alias
Rename the field name.
kvSeparator
Define the separator between keys and values.
listSeparator
Define the separator between each key-value item.
indices
Select which columns you want to extract from the list by their position in the list.
fields
Select which items you want to extract from the list by their key names.
The indices and fields parameters cannot be used simultaneously.
Examples
{myFieldName:keyValueList(kvSeparator=":",listSeparator=",")}
{myFieldName:keyValueList(fields=["hello ":string(alias="hello_")])}
String
String is a configurable field. The available parameters are:
alias
Rename the field name.
length
Define the length of the string.
escapableChar
Escape delimiter characters in the string.
Examples
{myField:string(length=2)}
{myField:string(length=2, alias="newFieldName")}
There is a special case with String fields. If the field is using the length parameter, we may add another field next to it without any separator:
{oneField:string(length=2)}{anotherField:string(length=3)}{lastField:string}
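Fixed-length fields need no separator because their boundaries are known in advance. A minimal sketch of the idea, using a hypothetical helper and a made-up input line:

```python
def split_fixed_length(line: str, lengths: list[int]) -> list[str]:
    """Cut the line into pieces of the given lengths; the last piece takes the rest."""
    parts, position = [], 0
    for length in lengths:
        parts.append(line[position:position + length])
        position += length
    parts.append(line[position:])
    return parts


# Lengths 2 and 3 correspond to oneField and anotherField; lastField takes the remainder.
print(split_fixed_length("abcdefgh", [2, 3]))  # ['ab', 'cde', 'fgh']
```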
XML
XML is a configurable field. The available parameters are:
alias
Rename the field name.
xpaths
Select which items you want to extract from the XML using a subset of the XPath query language.
Examples
{myFieldName:xml(xpaths=["/data/event","/data/event@id"])}
{myfield:xml(xpaths=["/data/name":string(alias="name"), "/data/event":listString])}
Syntax literals
A literal is a special type of element: a fixed string that must appear between two fields. Unlike fields, a literal is just a string of one or more characters. It cannot contain <, >, { or } unless they are escaped with \
This is an example of a literal (a whitespace):
{myFieldOne:string} {myFieldTwo:string}
Syntax operators
There are two types of operators: Skip and While.
Skip
The Skip operator acts like a dynamic separator. It can be used when we want to skip any content until we find a match.
This is a configurable operator that is equivalent to the regular expression (?:from)*(?=to) where from and to are the strings to match.
The available parameters are:
from
*
Define the string to find one or more times.
to
*
Define the string up to which the content is skipped.
Example
<skip(from=" ",to="-")>
A use case could be to skip all characters until a JSON is found. For example, for this log:
hello thisisrubbish{"my": "json"}
We could use this PCL:
{f1:string}<skip(from=" ", to="{")>{f2:json}
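The use case can be approximated in Python as follows. This sketch assumes that Skip starts at the first occurrence of from and discards everything up to, but not including, to. That reading comes from the use case above and is not necessarily the exact regular expression the operator uses; the function name is hypothetical.

```python
import json
import re


def parse_with_skip(line: str, skip_from: str, skip_to: str):
    """Extract a leading string and a trailing JSON, skipping the content in between."""
    pattern = re.compile(
        r"(?P<f1>.*?)"                             # first field, as short as possible
        + re.escape(skip_from)                     # where skipping starts
        + r".*?(?=" + re.escape(skip_to) + r")"    # skip until skip_to is next
        + r"(?P<f2>.*)",                           # remainder, expected to be JSON
        re.DOTALL,
    )
    match = pattern.fullmatch(line)
    if match is None:
        return None
    return match.group("f1"), json.loads(match.group("f2"))


print(parse_with_skip('hello thisisrubbish{"my": "json"}', " ", "{"))
# ('hello', {'my': 'json'})
```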
While
The While operator acts like a dynamic separator. It is useful if a separator has an unknown number of repetitions on each log.
This is a configurable operator that is equivalent to the regular expression (?:value)* where value is the string to match. However, if the options min and/or max are defined, then the equivalent regular expression is (?:value){min,max}
The available parameters are:
value
*
Define the string to find one or more times.
max
Set the maximum number of repetitions of value (must be greater than 0).
min
Set the minimum number of repetitions of value (must be greater than 0).
It is not necessary to define both max and min. However, if both are defined, then min must be strictly lower than max, that is, min < max.
<while(value=" ",max=2)>
In another example, consider " -" (a space followed by a hyphen) as a separator that appears at least 3 times in every log:
hello - - -world
goodbye - - - - -world
hello - - - -moon
Then, the PCL could be {f1:string}<while(value=" -", min=3)>{f2:string}
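The regular-expression equivalence described above can be tried out with a short Python sketch. The helper names are hypothetical, and the two string fields are approximated with lazy and greedy wildcards, which is an assumption about how string fields behave rather than a statement about the actual parser.

```python
import re


def while_regex(value: str, min_count=None, max_count=None) -> str:
    """Build the regular expression equivalent to a While operator."""
    body = "(?:" + re.escape(value) + ")"
    if min_count is None and max_count is None:
        return body + "*"
    low = "0" if min_count is None else str(min_count)
    high = "" if max_count is None else str(max_count)
    return body + "{" + low + "," + high + "}"


def parse(line: str):
    """Rough equivalent of {f1:string}<while(value=" -", min=3)>{f2:string}."""
    pattern = r"(?P<f1>.+?)" + while_regex(" -", min_count=3) + r"(?P<f2>.+)"
    match = re.fullmatch(pattern, line)
    return match.groupdict() if match else None


print(parse("hello - - -world"))        # {'f1': 'hello', 'f2': 'world'}
print(parse("goodbye - - - - -world"))  # {'f1': 'goodbye', 'f2': 'world'}
print(parse("hello - -world"))          # None (only two repetitions)
```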
Field options
alias
The value must follow the naming requirements:
The allowed set of characters is: A-Z, a-z, 0-9, ., -, _ or #.
An alias cannot start with _.
These are valid examples:
alias="myNewName"
alias="my-new-name"
These are invalid examples:
alias="_myNewName"
alias="my new name"
default
The value must be of the same type as the parent. For example, if it is the default value of an integer, then the default value must be an integer too.
These are valid examples:
{myfield:json(fields=["hello":string(default="{}")])}
{myfield:csv(fields=["world":int(default=-1)])}
These are invalid examples:
{myfield:json(fields=["hello":string(default=-1)])}
{myfield:float(default=1.5)}
decimalSeparator
The value could be:
,
. (default value)
These are valid examples:
decimalSeparator=","
decimalSeparator="."
These are invalid examples:
decimalSeparator=""
decimalSeparator="-"
decimalSeparator="_"
fields
The value must be a list of strings. Note that the list cannot contain other values (e.g. numbers).
Additionally, we may specify the type of each field by writing a colon (:) followed by the type: bool, float, int or string. For example: fields=["oneField":bool, "middleField", "anotherField":int]. If the type is omitted, the type is assumed to be string. In the previous example, middleField is assumed to be a string.
Each sub-type may have these options:
alias (optional) to rename the field.
default (optional) to set a fixed value if the field does not exist in the log.
These are valid examples:
fields=["oneField","anotherField.with.subField"]
fields=["oneField":string(alias="anotherName")]
fields=[]
These are invalid examples:
fields=[oneField,anotherField]
fields=["oneField,anotherField"]
fields=[0,1]
indices
The value must be a list of numbers. Note that the list can only contain non-negative integers (that is, zero or greater).
Additionally, we may specify the type of each index by writing a colon (:) followed by the type: bool, float, int or string. For example: indices=[0:bool, 1, 3:int]. If the type is omitted, the type is assumed to be string. In the previous example, the column at index 1 is assumed to be a string.
Each sub-type may have these options:
alias (optional) to rename the field.
default (optional) to set a fixed value if the field does not exist in the log.
These are valid examples:
indices=[0,1,3]
indices=[1:string(default="not exists")]
indices=[]
These are invalid examples:
indices=["0","1"]
indices=[-3,1]
kvSeparator
The value must be a non-empty text and can be as long as needed, since there is no character limit. By default, it is =.
These are valid examples:
kvSeparator=":"
kvSeparator="\t"
kvSeparator="hello"
These are invalid examples:
kvSeparator=""
kvSeparator=:
Note that " must be escaped. For example: kvSeparator="\"".
length
The value must be a strictly positive integer.
These are valid examples:
length=1
length=25
These are invalid examples:
length="1"
length=0
length=-3
listSeparator
The value must be a non-empty text. By default, it is a comma (,).
These are valid examples:
listSeparator=";"
listSeparator="|"
listSeparator="hello"
These are invalid examples:
listSeparator=""
listSeparator=;
Note that " must be escaped. For example: listSeparator="\"".
separator
The value must be one of the following characters: the pipe (|), the semicolon (;), the comma (,) or the tab (\t). By default, it is the comma.
These are valid examples:
separator=";"
separator="\t"
These are invalid examples:
separator="-"
separator=;
totalColumns
The value must be a strictly positive integer.
These are valid examples:
totalColumns=1
totalColumns=5
A valid value for this option must equal the number of columns in the CSV.
These are invalid examples:
totalColumns="1"
totalColumns=0
totalColumns=-3
thousandSeparator
The value could be:
empty string (default value).
,
.
These are valid examples:
thousandSeparator=""
thousandSeparator="."
These are invalid examples:
thousandSeparator="-"
thousandSeparator="_"
Use case
message: "foo|bar|"foo|bar"|another field after the CSV"
A valid expression to parse the message is:
{fieldName1:csv(separator="|",totalColumns=3)}|{fieldName2:string}
or
{csvField:csv(separator="|",indices=[0,1,2],totalColumns=3)}|{stringField:string}
Examples
CSV with 3 columns
We have the following log:
foo|bar|\"foo|bar\"|{"hello": "world"}
A valid expression to parse the message is:
{fieldName:csv(separator="|",totalColumns=3)}|{fieldName2:json()}
or
{csvField:csv(separator="|",indices=[0,1,2],totalColumns=3)}|{stringField:string}
Key-value list with duplicated keys
We have the following log:
key1=value1 key2=value2 key3=3 key1=anotherValue1
A valid expression to parse the message is:
{field:keyValueList(kvSeparator="=", listSeparator=" ", fields=["key1":listString(alias="key1AsList"), "key2":string(), "key3":int()])}
And it would extract these values:
field: "key1=value1 key2=value2 key3=3 key1=anotherValue1"
key1AsList:
- value1
- anotherValue1
field.key2: "value2"
field.key3: 3
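The handling of the duplicated key can be illustrated with a short Python sketch that collects every value found for a key into a list. The function name is hypothetical, and the sketch only mimics the extraction result, not the parser itself.

```python
def parse_key_value_list(text: str, kv_separator: str = "=", list_separator: str = ","):
    """Collect the values of each key in order of appearance, keeping duplicates."""
    values = {}
    for item in text.split(list_separator):
        key, _, value = item.partition(kv_separator)
        values.setdefault(key, []).append(value)
    return values


parsed = parse_key_value_list(
    "key1=value1 key2=value2 key3=3 key1=anotherValue1", list_separator=" "
)
print(parsed["key1"])          # ['value1', 'anotherValue1']
print(parsed["key2"][0])       # value2
print(int(parsed["key3"][0]))  # 3
```

Declaring key1 as listString keeps both values, whereas key2 and key3 are read as single values.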
Unknown repetitions
A valid PCL expression could be:
{lastName:string}, {firstName:string}<while(value=" ")>{age:int}: {info:json(fields=["country","occupation"])}
Here, there are 4 fields (lastName, firstName, age and info) and 3 delimiters (the literal ", ", the literal ": " and the operator while).
This PCL expression can be used to extract fields from different lines of text that have the same structure. For example, given the text Doe, John 37: {"country": "UK", "occupation": "father"}, the PCL expression can be used to extract the following fields:
lastName: "Doe"
firstName: "John"
age: 37
info: "{\"country\": \"UK\", \"occupation\": \"father\"}"
info.country: "UK"
info.occupation: "father"
In another example, given the text Smith, Jane 19: {"country": "USA", "occupation": "student"}, the same PCL expression would extract:
lastName: "Smith"
firstName: "Jane"
age: 19
info: "{\"country\": \"USA\", \"occupation\": \"student\"}"
info.country: "USA"
info.occupation: "student"
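For comparison, roughly the same extraction can be written as a Python regular expression. This is only an approximation for readers who think in regex terms: the pattern and the function name are assumptions, and string fields are modelled as lazy wildcards.

```python
import json
import re

# Rough regex counterpart of the PCL expression above (an assumption, not the parser's own regex).
PATTERN = re.compile(
    r"(?P<lastName>.+?), (?P<firstName>.+?)(?: )*(?P<age>\d+): (?P<info>.+)"
)


def parse_person(line: str):
    """Extract the same fields as the PCL expression, with dotted names for JSON items."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    info = json.loads(match.group("info"))
    return {
        "lastName": match.group("lastName"),
        "firstName": match.group("firstName"),
        "age": int(match.group("age")),
        "info.country": info.get("country"),
        "info.occupation": info.get("occupation"),
    }


print(parse_person('Doe, John 37: {"country": "UK", "occupation": "father"}'))
```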
String after a CSV
We have the following log:
foo|bar|"foo|bar"|another field after the CSV
A valid expression to parse the message is:
{fieldName1:csv(separator="|",totalColumns=3)}|{fieldName2:string}
or
{csvField:csv(separator="|",indices=[0,1,2],totalColumns=3)}|{stringField:string}
Syslog message
We have the following log:
<165>1 2003-10-11T22:14:15.003Z myhostname myapp 1234 ID47 - An application event log entry...
A valid PCL expression to parse this log would be:
<while(value=" ",max=2)>\<{priority:string}\>{version:int} {eventtimestamp:string} {hostname:string} {appName:string} {procId:string} {msgId:string} - {msg:string}
XML that contains several metadata values
We have the following log:
<event>
<date timezone="UTC">2025-02-27</date>
<metadata name="ProcId">1</metadata>
<metadata name="UserId">123</metadata>
<log>user logged in</log>
<log>user updated a registry</log>
<log>user logged out</log>
</event>
A valid PCL expression to parse this log would be:
{field:xml(xpaths=["/event/date","/event/date@timezone","/event/metadata":map(key="/@name",outputFormat="json"),"/event/log":listString])}
This would extract the following data:
field: "..." # Full XML here
field.event.date: "2025-02-27"
field.event.date#timezone: "UTC"
field.event.metadata: "{\"ProcId\":\"1\",\"UserId\":\"123\"}"
field.event.log:
- "user logged in"
- "user updated a registry"
- "user logged out"
XML that contains a list
We have the following log:
<data>
<log>user logged in</log>
<log>user updated a registry</log>
<log>user logged out</log>
</data>
A list can be extracted with the following PCL expression:
{field:xml(xpaths=["/data/log":listString])}
And it would extract the following information:
field: "..." # Full XML here
field.data.log:
- "user logged in"
- "user updated a registry"
- "user logged out"
XML that contains a list of objects
We have the following log:
<event>
<metadata>
<field>procId</field>
<value>1234</value>
</metadata>
<metadata>
<field>userId</field>
<value>4321</value>
</metadata>
<metadata>
<field>message</field>
<value>hello!</value>
</metadata>
</event>
A valid PCL expression to parse this log would be:
{field:xml(xpaths=["/event/metadata":map(key="/field",value="/value",outputFormat="json")])}
This would extract the following data:
field: "..." # Full XML here
field.event.metadata: "{\"procId\":\"1234\",\"userId\":\"4321\",\"message\":\"hello!\"}"
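The effect of the map type can be reproduced with Python's standard library to check what output to expect. The sketch below is an illustration only: the function name is hypothetical, and it hard-codes the field and value child elements instead of evaluating the key and value paths.

```python
import json
import xml.etree.ElementTree as ET

XML_LOG = """<event>
<metadata><field>procId</field><value>1234</value></metadata>
<metadata><field>userId</field><value>4321</value></metadata>
<metadata><field>message</field><value>hello!</value></metadata>
</event>"""


def metadata_map(xml_text: str) -> str:
    """Build a compact JSON object from each metadata element's field/value pair."""
    root = ET.fromstring(xml_text)
    pairs = {
        item.findtext("field"): item.findtext("value")
        for item in root.findall("metadata")
    }
    return json.dumps(pairs, separators=(",", ":"))


print(metadata_map(XML_LOG))
# {"procId":"1234","userId":"4321","message":"hello!"}
```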