Updating regex for extracting IBANs

Question

I have a regex that extracts German and Austrian IBANs. However, I just realized that the HTML code sometimes has a strange format.

\b(?:DE|AT)(?:\s?[0-9a-zA-Z]){18}(?:(?:\s?[0-9a-zA-Z]){2})?\b

Therefore, I have to exclude invalid IBAN matches. My demo shows examples of such mismatches. How would you exclude them?

Remove the chars a-z and use a space instead of `\s` like `\b(?:DE|AT)(?: ?[0-9]){18}(?:(?: ?[0-9]){2})?\b` https://regex101.com/r/aWQcKX/1 — The fourth bird, Jan 21 '21 at 15:13
Thank you very much for your help. This makes sense. Problem solved :-) — PParker, Jan 21 '21 at 15:18
Maybe rethink the pattern you have used to extract valid IBAN. [Here](https://stackoverflow.com/a/65735302/9758194) is a start under your own previous question. — JvdV, Jan 21 '21 at 16:19

score 1 · Accepted Answer · answered Jan 21 '21 at 15:22

It seems that you only want to match digits after the country code, so you can remove `a-zA-Z` from the character classes. Also note that `\s` matches any whitespace, including a newline, so if you don't want a match to span multiple lines, match an optional literal space instead.

\b(?:DE|AT)(?: ?[0-9]){18}(?:(?: ?[0-9]){2})?\b

See the updated regex demo
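
As a rough illustration of the difference, here is a minimal Python sketch that runs both patterns over the same text. The sample text, the variable names, and the choice of Python are assumptions made for this example and are not taken from the question. Note that the pattern only checks the shape of the string; it does not verify the IBAN checksum.

```python
import re

# Original pattern from the question: allows letters and any whitespace.
OLD_IBAN_RE = re.compile(
    r"\b(?:DE|AT)(?:\s?[0-9a-zA-Z]){18}(?:(?:\s?[0-9a-zA-Z]){2})?\b"
)

# Updated pattern from this answer: digits only, optional single spaces.
NEW_IBAN_RE = re.compile(r"\b(?:DE|AT)(?: ?[0-9]){18}(?:(?: ?[0-9]){2})?\b")

# Made-up sample text (assumption): two IBAN-shaped strings plus one
# word-like token that merely starts with "DE".
text = (
    "Pay to DE89 3704 0044 0532 0130 00 or AT61 1904 3002 3457 3201.\n"
    "Not an IBAN: DEsign1234567890123456"
)

def find_all(pattern, s):
    """Return all matches of pattern in s with whitespace removed."""
    return [re.sub(r"\s", "", m.group(0)) for m in pattern.finditer(s)]

old_matches = find_all(OLD_IBAN_RE, text)
new_matches = find_all(NEW_IBAN_RE, text)

print(old_matches)  # also contains the "DEsign..." token
print(new_matches)  # only the two digit-only candidates
```

The old pattern accepts the `DEsign...` token because letters are allowed after the country code, while the updated pattern skips it.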
