2

I want to extract a certain string from a path. The wanted string is always preceded by either \0_ASW\ or \10_BSW\. Additionally, the sought string consists only of letters and numbers.

So, for example, from the following 3 paths I want to extract the strings Mod2000, ModA and ModB:

C:\MyPath\0_ASW\Mod2000
C:\MyPath\10_BSW\ModA\SubDir
C:\MyPath\10_BSW\ModB

For that I have written a regex using Positive Lookbehind:

\\(?<=(0_ASW|10_BSW)\\)([A-Za-z0-9]+)

With this regex, the 2nd group matches the sought string correctly, and I am able to compile the regex in .NET (C#) without any errors. However, once I try to compile it in Python, I get the following regex error: A lookbehind assertion has to be fixed width

From my understanding, the two words in the positive lookbehind, i.e. 0_ASW and 10_BSW, ought to have a fixed length. The error is not clear to me because both words have a fixed length of 5 and 6 characters, respectively. If I make those 2 strings equal in length, e.g. the 3-character strings ASW and BSW, the regex compiles without the above error.

\\(?<=(ASW|BSW)\\)([A-Za-z0-9]+)
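The behaviour described above can be reproduced in isolation. This is a minimal sketch, assuming Python's built-in `re` module; the helper name `compiles` is made up for illustration and is not part of any library.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Lookbehind whose alternatives have different lengths
variable_width = r"\\(?<=(0_ASW|10_BSW)\\)([A-Za-z0-9]+)"
# Lookbehind whose alternatives have the same length
equal_width = r"\\(?<=(ASW|BSW)\\)([A-Za-z0-9]+)"

print(compiles(variable_width))
print(compiles(equal_width))
```

The first pattern is rejected because the alternatives inside the single lookbehind differ in length; the second is accepted because every alternative has the same width.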

How do I fix this regex so that it compiles in Python as well?

You can find the demos here:

https://regex101.com/r/qfwfJJ/1

https://regex101.com/r/zAVk5Z/1

Wiktor Stribiżew
SimpleThings

3 Answers

4

You could also use a non-capturing group:

\\(?:0_ASW|10_BSW)\\(\w+)

https://regex101.com/r/hYCRJf/1

If the regex matches, you'll get the desired string in group(1).
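A short sketch of how this could be used from Python, assuming the standard `re` module; the function name `extract_module` is chosen only for this example.

```python
import re

# Non-capturing group for the marker directory, capturing group for the name
PATTERN = re.compile(r"\\(?:0_ASW|10_BSW)\\(\w+)")


def extract_module(path: str):
    """Return the name that follows \\0_ASW\\ or \\10_BSW\\, or None if absent."""
    match = PATTERN.search(path)
    return match.group(1) if match else None


paths = [
    r"C:\MyPath\0_ASW\Mod2000",
    r"C:\MyPath\10_BSW\ModA\SubDir",
    r"C:\MyPath\10_BSW\ModB",
]
for path in paths:
    print(extract_module(path))
```

Note that `\w` also matches the underscore, so it is slightly broader than the "letters and numbers" requirement in the question; `[A-Za-z0-9]+` can be substituted if that matters.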

Eric Duminil
2

You can use two lookbehinds in an alternation like this. In Python, each lookbehind has to be fixed width, which the alternation in your pattern is not.

\b(?:(?<=\\0_ASW\\)|(?<=\\10_BSW\\))[A-Za-z0-9]+

See a regex101 demo.
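As a rough illustration, the pattern above can be tried with the standard `re` module. Each lookbehind is fixed width on its own, so the pattern compiles; the function name `find_name` is only an assumption for this sketch.

```python
import re

# Two separate fixed-width lookbehinds combined by an outer alternation
PATTERN = re.compile(r"\b(?:(?<=\\0_ASW\\)|(?<=\\10_BSW\\))[A-Za-z0-9]+")


def find_name(path: str):
    """Return the whole match (no groups needed), or None if nothing matches."""
    match = PATTERN.search(path)
    return match.group(0) if match else None


for path in (
    r"C:\MyPath\0_ASW\Mod2000",
    r"C:\MyPath\10_BSW\ModA\SubDir",
    r"C:\MyPath\10_BSW\ModB",
):
    print(find_name(path))
```

With this approach the sought string is the full match, so no capturing group is required.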


If you can make use of the PyPI regex module, you can match the prefix and then use \K to forget what has been matched so far:

\\(?:0_ASW|10_BSW)\\\K[A-Za-z0-9]+

See another regex101 demo.
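A possible way to try this, assuming the third-party `regex` package is installed (`pip install regex`). The standard `re` module does not support `\K`, so this sketch falls back to a capturing group when `regex` is unavailable; the function name `extract_with_k` is hypothetical.

```python
import re

try:
    import regex  # third-party module from PyPI, not part of the standard library
except ImportError:
    regex = None


def extract_with_k(path: str):
    """Return the name after the marker directory, or None if absent."""
    if regex is not None:
        # \K discards everything matched so far from the reported match
        match = regex.search(r"\\(?:0_ASW|10_BSW)\\\K[A-Za-z0-9]+", path)
        return match.group(0) if match else None
    # Fallback without \K: capture the name in group 1 instead
    match = re.search(r"\\(?:0_ASW|10_BSW)\\([A-Za-z0-9]+)", path)
    return match.group(1) if match else None


print(extract_with_k(r"C:\MyPath\10_BSW\ModA\SubDir"))
```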

The fourth bird
1

\\((0_ASW|10_BSW)\\)([A-Za-z0-9]+)

https://regex101.com/r/e7vH34/1
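A brief sketch of how the groups of this pattern are numbered in Python's `re` module, since the sought string ends up in group 3 here rather than group 2; the function name `extract_third_group` is made up for this example.

```python
import re

# Group 1: marker plus trailing backslash, group 2: marker, group 3: the name
PATTERN = re.compile(r"\\((0_ASW|10_BSW)\\)([A-Za-z0-9]+)")


def extract_third_group(path: str):
    """Return the name captured by group 3, or None if the pattern does not match."""
    match = PATTERN.search(path)
    return match.group(3) if match else None


match = PATTERN.search(r"C:\MyPath\0_ASW\Mod2000")
print(match.groups())
print(extract_third_group(r"C:\MyPath\10_BSW\ModB"))
```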

tomasborrella