I want to remove some characters (stored in a list) from the whole DataFrame without iterating through rows.
I have this list:
speChars = ['[1]', '[2]', '[3]', '[4]', '[5]', '[6]', '[7]', '[8]', '[9]',
'[10]', '[11]', '\[]', '\[', '\]', '\\n']
I removed the special characters from the column names using this code:
for char in speChars:
    NASA_CAS_Data.columns = NASA_CAS_Data.columns.str.replace(char, '')
Now I want to delete all special characters from the rows without naming every column. I tried iloc, but it didn't work because of a str error:
for char in speChars:
    NASA_CAS_Data.iloc[0:100, 0:100] = NASA_CAS_Data.iloc[0:100, 0:100].replace(char, '')
Is there a simpler and faster way to remove special characters from all rows of a DataFrame loaded from a CSV in pandas?
Answer
I'm somewhat new to StackOverflow, and after I edited my post the "Answer your question" button suddenly disappeared.
Rather than using loc or iloc, which raised <AttributeError: 'str' object has no attribute 'iloc'>, I created a list of the column names and iterated through it. I had also forgotten the escape character \ before the opening bracket. Without it, each pattern is read as a regular-expression character class, so it was deleting all the digits too (e.g. it deleted 876 and [7] instead of just [7] from the first row and third column in the picture).
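To illustrate the escaping problem on its own, here is a minimal sketch using Python's re module directly. The cell value is a made-up example, not taken from the actual CSV:

```python
import re

cell = "876 [7]"  # hypothetical cell value, for illustration only

# Unescaped: "[7]" is a character class that matches the single character "7",
# so every 7 is removed and the brackets are left behind.
unescaped = re.sub("[7]", "", cell)

# Escaped: r"\[7]" matches the literal text "[7]" and nothing else.
escaped = re.sub(r"\[7]", "", cell)

print(repr(unescaped))  # '86 []'
print(repr(escaped))    # '876 '
```

If the strings to remove are meant literally, re.escape can build the escaped pattern so the backslashes don't have to be typed by hand.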
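Final code:
import pandas as pd

df = pd.read_csv("/cleanedNASA.csv")
# Raw strings keep the backslashes intact; every opening bracket is escaped.
speChars = [r'\[1]', r'\[2]', r'\[3]', r'\[4]', r'\[5]', r'\[6]', r'\[7]',
            r'\[8]', r'\[9]', r'\[10]', r'\[11]', r'\[]', r'\[', r'\]', r'\n']
COList = df.columns.tolist()
for char in speChars:
    for col in COList:
        # regex=True is needed in pandas 2.0+, where str.replace defaults to literal matching
        df[col] = df[col].str.replace(char, '', regex=True)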
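For the "simpler & faster" part of the question, one possible alternative is to join the patterns into a single regular expression and let DataFrame.replace apply it to the whole frame in one call. This is only a sketch: the small DataFrame below is a hypothetical stand-in for the CSV, and I have not timed it against the nested loop.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data (column names and values are made up).
df = pd.DataFrame({
    "Name": ["Apollo[1]", "Gemini[10]\n"],
    "Year": ["1969[]", "[2]1965"],
    "Crew": [3, 2],  # non-string column
})

speChars = [r'\[1]', r'\[2]', r'\[3]', r'\[4]', r'\[5]', r'\[6]', r'\[7]',
            r'\[8]', r'\[9]', r'\[10]', r'\[11]', r'\[]', r'\[', r'\]', r'\n']

# Join everything into one alternation. Alternatives are tried left to right,
# so the bare brackets stay at the end as a fallback.
pattern = "|".join(speChars)

# With regex=True, DataFrame.replace substitutes inside string cells
# and leaves non-string cells as they are.
cleaned = df.replace(pattern, "", regex=True)

print(cleaned)
```

Unlike the .str accessor, this does not raise on numeric columns, so there is no need to list or filter the column names first.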