
I am trying to match German company titles with Python regex. Here is what I have:

(\bUnternehmergesellschaft\b|\bUG haftungsbeschränkt\b|\bUG (haftungsbeschränkt)\b|\bUG\b|\bUnternehmergesellschaft und Compagnie Kommanditgesellschaft\b|\bUG & Co. KG\b)

As you can see in the demo, I do not get a perfect match for each title. It fails especially when a title contains `.`, `(` or `)`.

PParker
  • 1) Escape special chars like `(` and `)`, etc. 2) Make sure you understand what `\b` matches in regex. `a\)\b` will only match in a string like `a)b` and not in `a);`. – Wiktor Stribiżew Jan 25 '21 at 10:35
  • Thanks, but how can I close a bracket in regex? https://regex101.com/r/7PqFq8/1 I tried different versions... – PParker Jan 25 '21 at 10:49
  • `\bUG \(haftungsbeschränkt\)`. If you are adding word boundaries dynamically, please [read the answer](https://stackoverflow.com/a/45145800/3832970). – Wiktor Stribiżew Jan 25 '21 at 10:51
  • I think I still have a fundamental problem in understanding. To solve my problem above, I need to understand why this (https://regex101.com/r/7PqFq8/2) doesn't work. – PParker Jan 25 '21 at 12:00
  • See [Order of regular expression operator (..|.. … ..|..)](https://stackoverflow.com/questions/35606426/order-of-regular-expression-operator), and [it will work](https://regex101.com/r/g5bO26/1). – Wiktor Stribiżew Jan 25 '21 at 12:48
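The points raised in the comments (escape the special characters, be careful with `\b` next to `)` or `.`, and mind the order of the alternatives) can be put together in a short Python sketch. This is only one possible approach, not necessarily what the linked demos do: the names `TITLES`, `build_title_pattern` and `find_titles` are made up for illustration, and the lookarounds `(?<!\w)` and `(?!\w)` are swapped in for `\b` as an assumption, so that a title ending in `)` can still match before `;` or at the end of the string.

```python
import re

# Titles copied from the pattern in the question.
TITLES = [
    "Unternehmergesellschaft",
    "UG haftungsbeschränkt",
    "UG (haftungsbeschränkt)",
    "UG",
    "Unternehmergesellschaft und Compagnie Kommanditgesellschaft",
    "UG & Co. KG",
]


def build_title_pattern(titles):
    """Build one alternation out of literal titles (hypothetical helper)."""
    # Alternation is tried left to right, so put longer titles first;
    # otherwise "UG" would win before "UG & Co. KG" gets a chance.
    ordered = sorted(titles, key=len, reverse=True)
    # re.escape turns "(", ")" and "." into literals.
    alternation = "|".join(re.escape(title) for title in ordered)
    # (?<!\w) / (?!\w): no word character directly before / after the title.
    # Unlike \b, these do not require a word character on the other side.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)")


TITLE_RE = build_title_pattern(TITLES)


def find_titles(text):
    """Return every company title found in text, in order of appearance."""
    return [match.group(0) for match in TITLE_RE.finditer(text)]


sample = "Muster UG (haftungsbeschränkt); Beispiel UG & Co. KG, Test UG"
print(find_titles(sample))
# ['UG (haftungsbeschränkt)', 'UG & Co. KG', 'UG']
```

Sorting by length is a simple heuristic for "most specific alternative first"; it is enough here because every shorter title that could shadow a longer one is a prefix of it.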
