
I have a regex that extracts German and Austrian IBANs. However, I just realized that HTML code sometimes has a strange format.

\b(?:DE|AT)(?:\s?[0-9a-zA-Z]){18}(?:(?:\s?[0-9a-zA-Z]){2})?\b

Therefore I have to exclude invalid IBAN matches. In my demo I show examples of such mismatches. How would you exclude them?

Wiktor Stribiżew
PParker
  • Remove the chars a-z and use a space instead of `\s`, like `\b(?:DE|AT)(?: ?[0-9]){18}(?:(?: ?[0-9]){2})?\b` https://regex101.com/r/aWQcKX/1 – The fourth bird Jan 21 '21 at 15:13
  • Thank you very much for your help. This makes sense. Problem solved :-) – PParker Jan 21 '21 at 15:18
  • Maybe rethink the pattern you have used to extract valid IBANs. [Here](https://stackoverflow.com/a/65735302/9758194) is a start under your own previous question. – JvdV Jan 21 '21 at 16:19

1 Answer


It seems that you only want to match digits, so you can remove a-zA-Z from the character classes. Also note that \s can match a newline, so if you don't want a match to span newlines, you can match an optional space instead.

\b(?:DE|AT)(?: ?[0-9]){18}(?:(?: ?[0-9]){2})?\b

See the updated regex demo
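
As a rough illustration, here is a minimal Python sketch of how the pattern above might be applied. The `find_ibans` helper and the sample strings are assumptions made up for this example; they are not taken from the question or the demo.

```python
import re

# Pattern from the answer: country code, then 18 digits, optionally
# followed by 2 more, each digit optionally preceded by one space.
IBAN_RE = re.compile(r"\b(?:DE|AT)(?: ?[0-9]){18}(?:(?: ?[0-9]){2})?\b")


def find_ibans(text):
    """Return every IBAN-like match in text, with spaces stripped."""
    return [m.group(0).replace(" ", "") for m in IBAN_RE.finditer(text)]


# Illustrative input: one spaced German-style and one compact Austrian-style value.
sample = "Pay to DE89 3704 0044 0532 0130 00 or AT611904300234573201 today."
print(find_ibans(sample))

# A run of letters after the country code (e.g. from HTML markup) no longer matches.
print(find_ibans("DEabcdefghijklmnopqr"))

# Because a literal space is used instead of \s, a match cannot span a newline.
print(find_ibans("DE89 3704 0044 0532\n0130 00"))
```

Note that the pattern checks only the shape of the string. It does not tie the length to the country code, and it does not verify the IBAN check digits.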

The fourth bird