0

We have a custom field defined in data studio which extracts and returns the file extension from a string in this case it is from the event label.

I have been using the below with some success

REGEXP_EXTRACT(Event Label, '\\.([\\w\\.-]+)$')

However I'm finding if the string contains multiple periods its including that aswell

Eg it's also extracting text like

  • 07.21.pdf
  • 7.22.PDF
  • 07.21.docx
  • docx.pdf

How can I tweak my regex to only include from the last period and ignore any earlier.

Nimantha
  • 6,405
  • 6
  • 28
  • 69
Taz
  • 28
  • 8

1 Answers1

1

You could try replacing [\\w\\.-] with [^\\.]

\\.([^\\.]+)$

[^\\.] will match everything except for ., so the match will not be able to contain dots inside.

The full formula would look like this:

REGEXP_EXTRACT(Event Label, '\\.([^\\.]+)$')
Nimantha
  • 6,405
  • 6
  • 28
  • 69
ZygD
  • 22,092
  • 39
  • 79
  • 102