I have a pandas DataFrame that contains many columns, such as Name, Email, Mobile Number, etc. It looks like this:
Sr No. Name Email Mobile Number
1. John joh***@gmail.com 1234567890,2345678901
2. kylie k.ki**@yahoo.com 6789012345
3. jon null 1234567890
4. kia kia***@gmail.com 6789012345
5. sam b.sam**@gmail.com 4567890123
I want to remove the rows that contain the same Mobile Number as another row. One person can have more than one number. I tried this with the drop_duplicates function:
newdf = df.drop_duplicates(subset = ['Mobile Number'],keep=False)
Here is the output:
Sr No. Name Email Mobile Number
1. John joh***@gmail.com 1234567890,2345678901
3. jon null 1234567890
5. sam b.sam**@gmail.com 4567890123
But the problem is that it only removes rows whose Mobile Number values are exactly the same. I want to remove every row that shares at least one number with another row; for example, Sr No. 1 and 3 have one number in common. How can I remove them so that the final output looks like this:
final output:
Sr No. Name Email Mobile Number
5. sam b.sam**@gmail.com 4567890123
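One possible approach, as a rough sketch rather than a definitive answer: split each Mobile Number value into its individual numbers, put one number per row with explode, and then drop every original row that owns a number also found in a different row. The sketch rebuilds the sample data by hand, and it assumes that the Mobile Number column holds comma-separated strings, that there are no missing values in that column, and that the DataFrame index is unique. The names exploded, pairs, shared and bad_rows are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table above (assumed to be stored as strings).
df = pd.DataFrame({
    "Sr No.": [1, 2, 3, 4, 5],
    "Name": ["John", "kylie", "jon", "kia", "sam"],
    "Email": ["joh***@gmail.com", "k.ki**@yahoo.com", None,
              "kia***@gmail.com", "b.sam**@gmail.com"],
    "Mobile Number": ["1234567890,2345678901", "6789012345",
                      "1234567890", "6789012345", "4567890123"],
})

# One number per row; the index still points at the original row.
exploded = df["Mobile Number"].str.split(",").explode().str.strip()

# (row, number) pairs, de-duplicated so that a number repeated inside
# a single row does not count as shared.
pairs = (
    exploded.rename("number")
    .rename_axis("row")
    .reset_index()
    .drop_duplicates()
)

# A number is shared if it appears in more than one distinct row.
shared = pairs["number"].duplicated(keep=False)
bad_rows = pairs.loc[shared, "row"].unique()

# Drop every row that owns at least one shared number.
newdf = df.drop(index=bad_rows)
print(newdf)
```

With the sample data, only the row for sam should remain, because 1234567890 is shared by Sr No. 1 and 3, and 6789012345 is shared by Sr No. 2 and 4. If the real column contains missing values, they would need to be handled first (for example with dropna), since duplicated treats missing values as equal to each other.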