I have the code below. I want to remove the bad characters from the addresses in a mailing list, and I tried to do this with the bad_chars() function.
import pandas as pd
import numpy as np
excelRead = pd.read_excel('mailing.xlsx')
excelRead.dropna(inplace= True)
badCharsList = ["ü", "ı", "ö", "ç", "ş", "ğ", "!", "#", "$", "%", "&", "'", "*", "+", "/", "=", "?" "^", "`", "{", "|", "}", "~", "(",")",",",":",";","<",">","[", " "]
def bad_chars(x):
    for i in badCharsList:
        if i.lower() not in x.lower():
            return i
        else:
            return np.nan
excelTest = excelRead[excelRead['mails'].str.endswith("@gmail.com", na=False) | excelRead['mails'].str.endswith("@hotmail.com", na=False) | excelRead['mails'].str.endswith("@outlook.com", na=False) | excelRead['mails'].str.endswith("@icloud.com", na=False) | excelRead['mails'].str.endswith("@windowslive.com", na=False) | excelRead['mails'].str.endswith("@yandex.com", na=False) | excelRead['mails'].str.endswith("@mynet.com", na=False) | excelRead['mails'].str.endswith("@hotmail.com.tr", na=False) | excelRead['mails'].str.endswith("@yahoo.com", na=False)]
lower = excelTest['mails'].str.lower()
testBad = excelTest['mails'].apply(bad_chars)
print(testBad)
But this is the output I get. Where did I go wrong, and how can I achieve this?
Output
0 ü
1 ü
2 ü
3 ü
4 ü
..
107808 ü
107809 ü
107810 ü
107811 ü
107812 ü
Name: mails, Length: 104507, dtype: object
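One reading of this output: for any address that does not contain "ü", the `not in` test succeeds on the first pass of the loop, so the function returns "ü" before any other character is checked. Separately, `"?" "^"` in the list has no comma between the two literals, so Python joins them into the single string "?^". Below is a minimal sketch of a checker that looks at every character, assuming the goal is to report which bad characters an address contains. The names `BAD_CHARS` and `find_bad_chars` are hypothetical and not part of the original code.

```python
# Hypothetical corrected list: note the comma between "?" and "^".
BAD_CHARS = ["ü", "ı", "ö", "ç", "ş", "ğ", "!", "#", "$", "%", "&", "'", "*",
             "+", "/", "=", "?", "^", "`", "{", "|", "}", "~", "(", ")", ",",
             ":", ";", "<", ">", "[", " "]


def find_bad_chars(address):
    """Return every bad character found in address, in BAD_CHARS order."""
    lowered = address.lower()
    # Test with "in" (not "not in") and collect all hits instead of
    # returning on the first loop iteration.
    return [ch for ch in BAD_CHARS if ch in lowered]


print(find_bad_chars("okanmercannn@hotmail.com"))  # []
print(find_bad_chars("gül şen@gmail.com"))         # ['ü', 'ş', ' ']

# With the DataFrame from the question, this could be applied as:
#   excelTest['mails'].apply(find_bad_chars)
```

An empty list means the address is clean, which is easier to filter on than a mix of characters and NaN.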
Sample of the data before applying bad_chars():
0 okanmercannn@hotmail.com
1 06hvm42hotmailcom@gmail.com
2 adanasenol01@gmail.com
3 sezersenturk6305@gmail.com
4 alyasu1903@gmail.com
...
107808 elifyucel2566@gmail.com
107809 yayla19871987@gmail.com
107810 zeynepyilkus@gmail.com
107811 pathoss_theodra@hotmail.com
107812 ziver.7340@gmail.com
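If the goal is to strip the bad characters out of each address rather than report them, one possible approach is a translation table. This is a minimal sketch under that assumption. `BAD_CHARS`, `REMOVE_TABLE`, and `remove_bad_chars` are hypothetical names, and the character set is the one from the question with "?" and "^" treated as separate characters.

```python
# All bad characters as one string; each character is removed individually.
BAD_CHARS = "üıöçşğ!#$%&'*+/=?^`{|}~(),:;<>[ "

# str.maketrans with a third argument maps each of those characters to None,
# which makes str.translate delete them.
REMOVE_TABLE = str.maketrans("", "", BAD_CHARS)


def remove_bad_chars(address):
    """Lowercase the address and delete every character in BAD_CHARS."""
    return address.lower().translate(REMOVE_TABLE)


print(remove_bad_chars("ziver.7340@gmail.com"))  # ziver.7340@gmail.com
print(remove_bad_chars("gül şen!@gmail.com"))    # glen@gmail.com

# With the DataFrame from the question, this could be applied as:
#   excelTest['mails'].apply(remove_bad_chars)
```

Deleting characters changes the address, so it may be safer to flag or drop such rows instead of rewriting them, depending on how the list will be used.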