I need to replace German phone numbers in Python. The formats are well explained here: Regexp for german phone number format
Possible formats are:
(06442) 3933023
(02852) 5996-0
(042) 1818 87 9919
06442 / 3893023
06442 / 38 93 02 3
06442/3839023
042/ 88 17 890 0
+49 221 549144 – 79
+49 221 - 542194 79
+49 (221) - 542944 79
0 52 22 - 9 50 93 10
+49(0)121-79536 - 77
+49(0)2221-39938-113
+49 (0) 1739 906-44
+49 (173) 1799 806-44
0173173990644
0214154914479
02141 54 91 44 79
01517953677
+491517953677
015777953677
02162 - 54 91 44 79
(02162) 54 91 44 79
I am using the following code:
df['A'] = df['A'].replace(r'(\(?([\d \-\)\–\+\/\(]+)\)?([ .\-–\/]?)([\d]+))', 'TEL', regex=True)
The problem is that the text also contains dates and times, which the regex matches as well:
df['A']
2017-03-07 13:48:39 Dear Sear Madam...
These need to be kept. How can I exclude the formats 2017-03-07 and 13:48:39 from my regex replacement?
Short example:
df['A']
2017-03-077
2017-03-07
0211 11112244
Desired output:
df['A']
TEL
2017-03-07
TEL
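
One possible direction, as a minimal sketch rather than a definitive solution: match the dates and times first in a named group and hand them back unchanged, and replace only what the second alternative matches. The names KEEP, PHONE, PATTERN, keep_or_mask and mask_phone_numbers are made up for illustration, and PHONE is a deliberately simplified stand-in for the linked regex (it assumes a number starts with +, ( or a digit and ends with a digit, so it would also catch other digit runs).

```python
import re

# Dates (YYYY-MM-DD) and times (HH:MM:SS) that must survive untouched.
# The lookarounds reject matches glued to further digits, so
# "2017-03-077" is NOT treated as a date.
KEEP = r'(?<!\d)(?:\d{4}-\d{2}-\d{2}|\d{2}:\d{2}:\d{2})(?!\d)'

# Simplified phone pattern (an assumption, not the regex from the question):
# optional "+" or "(", then digits and separators, ending on a digit.
PHONE = r'[+(]?\d[\d \-–/()]*\d'

# The "keep" alternative is tried first at every position.
PATTERN = re.compile(r'(?P<keep>' + KEEP + r')|' + PHONE)


def keep_or_mask(match):
    """Return dates/times unchanged and replace everything else with TEL."""
    kept = match.group('keep')
    return kept if kept is not None else 'TEL'


def mask_phone_numbers(text):
    """Replace phone-like digit runs in text, leaving dates and times alone."""
    return PATTERN.sub(keep_or_mask, text)


for sample in ['2017-03-077', '2017-03-07', '0211 11112244']:
    print(mask_phone_numbers(sample))
```

For the DataFrame column, the same pieces should plug into pandas as df['A'] = df['A'].str.replace(PATTERN, keep_or_mask, regex=True), since Series.str.replace accepts a compiled pattern and a callable replacement.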