I am trying to replace all international phone numbers (mostly European ones) in a pandas DataFrame column.

Currently I have:

df['A'] = df['A'].replace('^(?:(?:\+?1\s*(?:[.-]\s*)?)?(?:\(\s*([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9])\s*\)|([2-9]1[02-9]|[2-9][02-8]1|[2-9][02-8][02-9]))\s*(?:[.-]\s*)?)?([2-9]1[02-9]|[2-9][02-9]1|[2-9][02-9]{2})\s*(?:[.-]\s*)?([0-9]{4})(?:\s*(?:#|x\.?|ext\.?|extension)\s*(\d+))?$',r'\Tel', regex=True)

I am following this well-known question on here: A comprehensive regex for phone number validation

But somehow this phone number, 04265-217866, is not matched by it. Any ideas how to tune it?

The data in df['A'] is German text and looks like:

df['A']
Sehr geehrter Herr... Mit freundlichen Grüßen 0049-172 387898
Ich hoffe ich konnte helfen 0021 111789
Sie erreichen mich unter 04265-217866

The desired outcome in this case would be:

Sehr geehrter Herr... Mit freundlichen Grüßen Tel
Ich hoffe ich konnte helfen Tel
Sie erreichen mich unter Tel

My phone numbers are European (German) numbers:

0049 231 184989
+49 231 184989
0049231184989
0231 - 184989

and more
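
For reference, the pattern above is anchored with ^ and $, so it can only match a cell that consists of nothing but a phone number, and its digit groups follow the North American layout (optional +1, three-digit area code, three digits, four digits). A minimal sketch of a looser approach for the sample formats listed above might look like the following. The pattern, the 7-digit minimum, and the helper name mask_phone_numbers are assumptions made for illustration, not a complete phone number validator.

```python
import re

# Assumed pattern (not taken from the linked question): a number starts with
# "+" or "0", followed by digit groups separated by optional spaces and/or a hyphen.
PHONE_RE = re.compile(
    r"(?<![\w+])"                        # do not start inside a word or another number
    r"(?:\+|0)\d{1,5}"                   # "+49", "0049", or an area code such as "0231"
    r"(?:(?:\s*-\s*|\s+)?\d{2,}){1,3}"   # one to three further digit groups
)


def mask_phone_numbers(text, placeholder="Tel"):
    """Replace phone-like substrings with a placeholder (hypothetical helper)."""

    def _replace(match):
        # Assumption: require at least 7 digits so short codes are left alone.
        digit_count = sum(ch.isdigit() for ch in match.group(0))
        return placeholder if digit_count >= 7 else match.group(0)

    return PHONE_RE.sub(_replace, text)


samples = [
    "Sehr geehrter Herr... Mit freundlichen Grüßen 0049-172 387898",
    "Ich hoffe ich konnte helfen 0021 111789",
    "Sie erreichen mich unter 04265-217866",
]
masked = [mask_phone_numbers(s) for s in samples]
```

Applied to the column, this could be used as df['A'] = df['A'].apply(mask_phone_numbers). Because the pattern is deliberately loose, it may also swallow other long digit runs that start with 0 or +, so it should be checked against real data before use.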

jnovack
PV8
  • 1
    what does the data in `df['A']` look like? can we take a peek? – MEdwin Oct 28 '19 at 11:56
  • 2
    The regex does not match numbers like yours. It is not clear what your phone number requirements are. So, no idea what regex you need. – Wiktor Stribiżew Oct 28 '19 at 11:59
  • Try this one: ^011(9[976]\d|8[987530]\d|6[987]\d|5[90]\d|42\d|3[875]\d| 2[98654321]\d|9[8543210]|8[6421]|6[6543210]|5[87654321]| 4[987654310]|3[9643210]|2[70]|7|1)\d{0,14}$ You can replace each \d with [0-9] if your regex syntax doesn't support \d. – Ashish Kumar Saxena Oct 28 '19 at 12:05
