I am trying to extract the German VAT number (Umsatzsteuer-Identifikationsnummer) from a text.

string = "I want to get this DE813992525 number."

I know that the correct regex for this problem is (?xi)^( (DE)?[0-9]{9}|)$. It works well in my demo.

What I tried is:

import re

string = "I want to get this DE813992525 number."
match = re.compile(r'(?xi)^( (DE)?[0-9]{9}|)$')
print(match.findall(string))

>>>>>> []

What I would like to get is:

print(match.findall(string))
>>>>>  DE813992525
Wiktor Stribiżew
PParker
Why not just `^DE[0-9]{9}$`? https://regex101.com/r/FDuzNE/1 See https://ideone.com/nRaAXx – The fourth bird Sep 10 '20 at 11:44

No, it's [not the correct one](https://regex101.com/r/yMFxD7/1): the `$` anchor means end of string, and the VAT number in your test string is not at the end. – buran Sep 10 '20 at 11:44

1 Answer

When searching within a string, don't use ^ and $, because they anchor the pattern to the start and end of the whole string:

import re
string = """I want to get this DE813992525 number.
I want to get this DE813992526 number.
"""
match = re.compile(r'DE[0-9]{9}')
print(match.findall(string))

Out:

['DE813992525', 'DE813992526']
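
If stray matches inside longer tokens are a concern, one possible refinement is to wrap the pattern in `\b` word boundaries. This is only a sketch under the assumption that a VAT ID in your text is always "DE" followed by exactly nine digits and is delimited by non-word characters; the names `anchored` and `embedded` are illustrative and not from the question:

```python
import re

text = "I want to get this DE813992525 number."

# Anchored: ^ and $ require the entire string to be a VAT ID.
anchored = re.compile(r'^DE[0-9]{9}$')

# Unanchored with word boundaries: finds a VAT ID embedded in text,
# but skips one that is glued to other letters or digits.
embedded = re.compile(r'\bDE[0-9]{9}\b')

print(anchored.findall(text))              # []
print(anchored.findall("DE813992525"))     # ['DE813992525']
print(embedded.findall(text))              # ['DE813992525']
print(embedded.findall("XDE8139925250"))   # []
```

The anchored pattern is still useful for validating a field that should contain nothing but the VAT ID; the unanchored one is for extraction from free text.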
Maurice Meyer