How to validate mobile number

Question

I have a DataFrame like this:

   Contact Number
0   
1   NaN
2   6363887122.0
3   6363887122.0

I want this output:

   Contact Number Status_contactNUmber  Invalid_contactNUmber
0                           Blank/Null                   True
1             NaN           Blank/Null                   True
2      6363887122                Valid                  False
3      6363887122                Valid                  False

I tried this:

def contactNumber(ele):
    if (pd.isna(ele) or (ele=='')):
        return ("Blank/Null",True)
    elif re.search(r'^([0]|\+91)?[6789]\d{9}$',ele):
#     elif ele.str.contains(r'^([0]|\+91)?[6789]\d{9}$'):
        return ("Valid",False)
    else:
        return ("invalid",True)
df[['Status_contactNUmber','Invalid_contactNUmber']] = df['Contact Number'].apply(contactNumber).tolist()

But it raises an error because the Contact Number column is of float type.

Use `df["Contact Number"].astype(int)` to get those values as integers. — Shubham, May 29 '21 at 06:17
This has previously been answered on SO [here](https://stackoverflow.com/questions/41550746/error-using-astype-when-nan-exists-in-a-dataframe/41550787). I would suggest you use the `fillna()` method to replace NaNs with invalid values. Or you can use pandas' **Int64** which does allow NaNs as mentioned in the post linked — Shubham, May 29 '21 at 06:21

score 0 · Answer 1 · answered May 29 '21 at 06:18

Please change your column type from float to string.

Satyendra Yadav


But then every value goes to invalid because the regex does not match. – May 29 '21 at 06:21

score 0 · Accepted Answer · answered May 29 '21 at 08:40

First of all, the 0 you see at the top of your df's index is actually the name of the index, not the first row. The first row of your df starts at index = 1 (the NaN value). This also follows from the dtype: if the Contact Number column is of type float, it cannot hold a "" value; a missing entry shows up as NaN, as it does at index = 1.

I confirmed this by copying your df and checking its index (see the name '0' and index starting from 1):

>>> df = pd.read_clipboard(r'\s\s+')
>>> df.index
Int64Index([1, 2, 3], dtype='int64', name='0')

As for what you want, you can handle it inside your function: convert ele to int first to drop the .0 from the phone number, then convert it to str for regex matching:

import re
import pandas as pd

def contactNumber(ele):
    # Missing values (NaN) are reported as blank
    if pd.isna(ele):
        return ("Blank/Null", True)
    # int() drops the trailing .0 of the float; str() gives re.search a string
    elif re.search(r'^([0]|\+91)?[6789]\d{9}$', str(int(ele))):
        return ("Valid", False)
    else:
        return ("invalid", True)

You don't need the (ele=='') condition, because as stated, float type columns will not have blank strings.
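To see why the int conversion matters, here is a minimal sketch that runs the same regex against one of the sample values converted two ways. The names PATTERN, value, as_float_text and as_int_text are only illustrative and are not part of the original code.

```python
import re

# Same pattern as in the question: optional 0 or +91 prefix,
# then a 10-digit number starting with 6, 7, 8 or 9.
PATTERN = r'^([0]|\+91)?[6789]\d{9}$'

value = 6363887122.0  # how pandas stores the number in a float column

# Converting the float straight to str keeps the trailing ".0",
# so the pattern (which must end right after the 10 digits) fails.
as_float_text = str(value)
float_matches = re.search(PATTERN, as_float_text) is not None

# Going through int first drops the ".0", so the pattern matches.
as_int_text = str(int(value))
int_matches = re.search(PATTERN, as_int_text) is not None

print(as_float_text, float_matches)
print(as_int_text, int_matches)
```

This is also why changing the column to string alone sends every number to "invalid": the text still ends in .0.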

Output:

>>> df[['Status_contactNUmber','Invalid_contactNUmber']] = df['Contact Number'].apply(contactNumber).tolist()
>>> df
   Contact Number Status_contactNUmber  Invalid_contactNUmber
0                                                            
1             NaN           Blank/Null                   True
2      6363887122              Valid                    False
3      6363887122              Valid                    False
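As an alternative to apply, the nullable Int64 dtype mentioned in the comments can be used to do the same check column-wise. This is only a sketch under the assumption that the column holds whole-number floats and NaN; add_status_columns and its col parameter are hypothetical names, not something from the question.

```python
import numpy as np
import pandas as pd

# Same pattern as in the question
PATTERN = r'^([0]|\+91)?[6789]\d{9}$'

def add_status_columns(df, col='Contact Number'):
    """Hypothetical helper: return a copy of df with the two status columns."""
    out = df.copy()
    # Nullable Int64 drops the trailing .0 and keeps missing values as <NA>;
    # the string dtype then gives text that the regex can be matched against.
    as_text = out[col].astype('Int64').astype('string')
    is_blank = as_text.isna()
    # Missing values produce <NA> from str.match, so treat them as not valid
    is_valid = as_text.str.match(PATTERN).fillna(False).astype(bool)
    out['Status_contactNUmber'] = np.select(
        [is_blank, is_valid], ['Blank/Null', 'Valid'], default='invalid'
    )
    out['Invalid_contactNUmber'] = ~is_valid
    return out

sample = pd.DataFrame({'Contact Number': [np.nan, 6363887122.0, 12345.0]})
print(add_status_columns(sample))
```

The third sample value (12345.0) is made up here to exercise the "invalid" branch. Note that astype('Int64') is expected to raise if the column contains floats with a fractional part, so this sketch only fits data like the one shown in the question.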
