
I have a string with 5 pieces of data delimited by underscores:

AAA_BBB_CCC_DDD_EEE

I want a different regex for each component. Each regex needs to return just that one component. For example, the first would return just AAA, the second just BBB, etc.

I am able to parse out AAA with the following:

^([^_]*)?

I see that I can use a lookbehind like this to find everything after the first underscore:

(?<=[^_]*_).*
BBB_CCC_DDD_EEE

But the following cannot find just BBB:

(?<=[^_]*_)[^_]*(?=_)
George Hernando

2 Answers


You can mix lookbehind and lookahead (the 2nd pattern relies on taking only the first match):

^([^_]+)? // 1st
(?<=_)[^_]+ // 2nd
(?<=_)[^_]+(?=_[^_]+_[^_]+$) // 3rd
(?<=_)[^_]+(?=_[^_]+$) // 4th
[^_]+$ // 5th
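
As a quick check, the five patterns above can be run against the sample string. This is a minimal sketch in Python, which is an assumption here since the question does not name a regex engine. Python's built-in re module accepts these patterns because the lookbehind (?<=_) has a fixed width. The variable names are illustrative only.

```python
import re

s = "AAA_BBB_CCC_DDD_EEE"

# One pattern per field, copied from the list above.
patterns = [
    r"^([^_]+)?",                     # 1st
    r"(?<=_)[^_]+",                   # 2nd (correct only as the first match)
    r"(?<=_)[^_]+(?=_[^_]+_[^_]+$)",  # 3rd
    r"(?<=_)[^_]+(?=_[^_]+$)",        # 4th
    r"[^_]+$",                        # 5th
]

# re.search returns the leftmost match for each pattern.
fields = [re.search(p, s).group(0) for p in patterns]
print(fields)
```

Note that the 2nd pattern matches every field after the first when applied globally (for example with re.findall), so it only isolates BBB when you take the first match.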

If the lengths of the strings between the underscores are known, it can also be done like this:

1st match

^([^_]+)?

2nd match

(?<=_)\K[^_]+

3rd match

(?<=_[A-Za-z]{3}_)\K[^_]+

4th match

(?<=_[A-Za-z]{3}_[A-Za-z]{3}_)\K[^_]+

5th match

(?<=_[A-Za-z]{3}_[A-Za-z]{3}_[A-Za-z]{3}_)\K[^_]+

Each {3} expresses the length of the string between the underscores.
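
The fixed-length variant can be sketched in Python as follows. Two assumptions: Python's built-in re module does not accept \K, so the sketch drops it (the lookbehind alone already keeps the prefix out of the match), and field_by_lookbehind is a hypothetical helper name that builds the lookbehind for a given field index.

```python
import re

def field_by_lookbehind(text, index, width=3):
    """Return field `index` (0-based), assuming every earlier field
    after the first is exactly `width` letters long."""
    if index == 0:
        pattern = r"^[^_]+"
    else:
        # "_" followed by (index - 1) fixed-width fields, each ending in "_".
        # The total width is constant, which Python's re requires for lookbehind.
        prefix = "_" + ("[A-Za-z]{%d}_" % width) * (index - 1)
        pattern = "(?<=" + prefix + ")[^_]+"
    m = re.search(pattern, text)
    return m.group(0) if m else None

s = "AAA_BBB_CCC_DDD_EEE"
print([field_by_lookbehind(s, i) for i in range(5)])
```

Because the lookbehind hard-codes the field width, this approach only works when the earlier fields really do have that length.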

AndresDLRG

If your string always uses underscores as delimiters, you could use a single regex that captures the value in a capturing group. Repeat the pattern for what comes before it (in this case, one or more non-underscore characters followed by an underscore) using a quantifier such as {3}, which you can change to select a different component.

This way the quantifier specifies how many times to repeat the preceding pattern before capturing your match. For your example string AAA_BBB_CCC_DDD_EEE you could use {0}, {1}, {2}, {3} or {4}.

^(?:[^_\n]+_){3}([0-9A-Za-z]+)(?:_[^_\n]+)*$

That would match:

  • ^ Assert position at start of the line
  • (?:[^_\n]+_){3} In a non-capturing group (?:, match any character that is not an underscore or a newline one or more times [^_\n]+, followed by an underscore, and repeat that n times (in this example n is 3)
  • ([0-9A-Za-z]+) Capture your characters in a group using, for example, a character class (or use [^_]+ to match anything that is not an underscore, but that will also match whitespace characters)
  • (?:_[^_\n]+)* After the captured value, repeat a non-capturing group that matches an underscore followed by one or more characters that are not an underscore or a newline; the group repeats zero or more times to get a full match
  • $ Assert position at the end of the line
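
The quantifier idea can be sketched in Python as below. The language choice and the helper name nth_field are assumptions for illustration; the pattern itself is the one shown above with the repeat count substituted in.

```python
import re

def nth_field(text, n):
    """Return the n-th (0-based) underscore-delimited field, or None."""
    # Skip n "field_" prefixes, capture the next field, then consume the rest.
    pattern = r"^(?:[^_\n]+_){" + str(n) + r"}([0-9A-Za-z]+)(?:_[^_\n]+)*$"
    m = re.search(pattern, text)
    return m.group(1) if m else None

s = "AAA_BBB_CCC_DDD_EEE"
print([nth_field(s, n) for n in range(5)])
```

The value is read from capture group 1 rather than from the full match, since the full match spans the whole line.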
The fourth bird