I have the following code in Python (pandas) on Databricks. It runs fine, but it does not filter out invalid phone numbers.
The code matches each number against a pattern and tags it as either a mobile or a home phone number:
import pandas as pd
import re
from pyspark.sql.functions import lit, udf

df = Phonevalidation

# function to check the phone number pattern
def isValid(s):
    Pattern = re.compile("(0|44)?[7-9][0-9]{9}")
    if Pattern.match(s):
        return 'Mobile Number'
    else:
        return 'Home phone'

# register the function as a UDF
PhType = udf(isValid)
df1 = df.withColumn('Phtype', PhType('Phonenumber'))
display(df1)
I expect invalid phone numbers to be tagged as invalid: any number whose length is greater or less than 10, and numbers like 0000000 or 11111.
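
For reference, here is a minimal sketch of the behaviour I am after, written as a plain Python function so it can be checked without Spark. Two things seem to cause the current result: `Pattern.match` only anchors at the start of the string, so longer numbers still match, and the `else` branch labels everything else 'Home phone'. The name `classify_phone` and the exact rules are my assumptions: prefixed mobile numbers (0 or 44) are still accepted, any other number must be exactly 10 digits, and a number made of one repeated digit is invalid.

```python
import re

# Same mobile pattern as in the question; fullmatch() below requires the
# whole string to match, not just a prefix of it.
MOBILE_PATTERN = re.compile(r"(0|44)?[7-9][0-9]{9}")
DIGITS_ONLY = re.compile(r"[0-9]+")


def classify_phone(s):
    """Hypothetical helper: return 'Mobile Number', 'Home phone' or 'Invalid'."""
    if s is None:
        return 'Invalid'
    s = str(s).strip()
    # Reject empty strings and anything containing non-digit characters.
    if not DIGITS_ONLY.fullmatch(s):
        return 'Invalid'
    # Reject numbers made of a single repeated digit (0000000, 11111, ...).
    if len(set(s)) == 1:
        return 'Invalid'
    # Whole string must be a mobile number, with or without the 0/44 prefix.
    if MOBILE_PATTERN.fullmatch(s):
        return 'Mobile Number'
    # Assumption: any other number is a home phone only if it has 10 digits.
    if len(s) == 10:
        return 'Home phone'
    return 'Invalid'


# In the notebook this would replace isValid when registering the UDF:
#   PhType = udf(classify_phone)
#   df1 = df.withColumn('Phtype', PhType('Phonenumber'))
```

With these rules, '7912345678' and '07912345678' are tagged 'Mobile Number', '0123456789' is tagged 'Home phone', and '0000000', '11111' and '79123456789' are tagged 'Invalid'.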