
I'm trying to parse a string and extract some numbers from it. Basically, any 2-3 digits should be matched, except the ones that have "TEST" before them. Here are some examples:

TEST2XX_R_00.01.211_TEST => 00, 01, 211
TEST850_F_11.22.333_TEST => 11, 22, 333
TESTXXX_X_12.34.456      => 12, 34, 456

Here are some of the things I've tried:

(?<!TEST)[0-9]{2,3} - ignores only the first digit after TEST

_[0-9]{2,3}|\.[0-9]{2,3} - matches the numbers correctly, but matches the character before them (_ or .) as well.

I know this might be a duplicate of "regex for matching something if it is not preceded by something else", but I could not find my answer there.

Wiktor Stribiżew
PoVa
  • In the second case, Is it not possible for you to capture the required data within a group and fetch it from there? `[_.](\d{2,3})` [LINK](https://regex101.com/r/VKsXwg/1) – Gurmanjot Singh Oct 06 '17 at 06:28
  • Lua does not support regex, only specific Lua patterns. You should tag these questions with `lua-patterns`, *unless* you are using an external regex engine. Are you? – Wiktor Stribiżew Oct 06 '17 at 06:30
  • @WiktorStribiżew you're right, I was experimenting in a browser – PoVa Oct 06 '17 at 06:41
  • @WiktorStribiżew - Please restore the "regex" tag. Searching for "lua" + "regex" is a usual practice to search for Lua pattern-related questions. Actually, nobody (except you) expects that "lua-pattern" tag exists. Tags on SO were introduced to help usual users, they are not for academically correct classification. – Egor Skriptunoff Oct 06 '17 at 07:01
  • @EgorSkriptunoff Ok, let's add `regex` back (since OP tests were made at regex101 and show original logic), but keep `lua-patterns`. – Wiktor Stribiżew Oct 06 '17 at 07:07
  • @PoVa - Try pattern `%f[T%d]%d+`, it matches every chain of digits not preceded by letter `T`, it may be what you need: `local s = "TEST2XX_R_00.01.211_TEST"; for x in s:gmatch"%f[T%d]%d+" do print(x) end` – Egor Skriptunoff Oct 06 '17 at 09:47

1 Answer

Unfortunately, there is no way to use a single pattern to match a string that is not preceded by some sequence in Lua. Note that you can't even rely on capturing the alternative you need: TEST%d+|(%d+) will not work in Lua because Lua patterns do not support alternation.
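To illustrate the point about alternation, here is a minimal sketch (the variable names are only for illustration). In a Lua pattern, `|` is an ordinary character, so a regex-style alternative is read as literal text:

```lua
-- "|" has no special meaning in Lua patterns; it only matches a literal "|".
local pattern = "TEST%d+|%d+"

-- The sample input contains no "|" character, so nothing matches.
local no_match = string.match("TEST2XX_R_00.01.211_TEST", pattern)

-- The pattern matches only when a literal "|" sits between the two parts.
local literal_match = string.match("TEST2|00", pattern)

print(no_match)       -- nil
print(literal_match)  -- TEST2|00
```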

You may remove all substrings that start with TEST + digits after it, and then extract digit chunks:

local s = "TEST2XX_R_00.01.211_TEST"
-- Remove "TEST" plus the digits right after it, then iterate over the remaining digit runs
for x in string.gmatch(s:gsub("TEST%d+", ""), "%d+") do
    print(x)
end

See the Lua demo

Here, s:gsub("TEST%d+","") removes every TEST that is followed by one or more digits (together with those digits), and the %d+ pattern used with string.gmatch extracts all the digit chunks that remain.
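Note that %d+ accepts digit runs of any length, while the question asks for 2-3 digits. Lua patterns have no {2,3} quantifier, so one possible way to approximate it is with frontier patterns (`%f`). The sketch below is an assumption on my part rather than part of the approach above: `extract_numbers` is a hypothetical helper name, and `%f` is assumed to be available (it is documented from Lua 5.2 on).

```lua
-- Hypothetical helper: collect runs of exactly 2 or 3 digits that remain
-- after removing every "TEST<digits>" prefix.
local function extract_numbers(s)
  -- The extra parentheses keep only the string result of gsub
  -- and drop the substitution count.
  local cleaned = (s:gsub("TEST%d+", ""))
  local found = {}
  -- %f[%d]   : a digit run starts here (previous char is not a digit)
  -- %d%d%d?  : two digits plus an optional third
  -- %f[%D]   : the digit run ends here (next char is not a digit, or end of string)
  for x in cleaned:gmatch("%f[%d]%d%d%d?%f[%D]") do
    found[#found + 1] = x
  end
  return found
end

local samples = {
  "TEST2XX_R_00.01.211_TEST",
  "TEST850_F_11.22.333_TEST",
  "TESTXXX_X_12.34.456",
}
for _, s in ipairs(samples) do
  print(s, table.concat(extract_numbers(s), ", "))
end
```

With this pattern, a single digit or a run of four or more digits is skipped entirely instead of being partially matched.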

Wiktor Stribiżew