I have some invalid characters in my file that I'm trying to remove. But I ran into a strange problem with one of them.
When I try to use the replace function, I get the error SyntaxError: EOL while scanning string literal.
I found that I was dealing with \x1d, which is a group separator. I have this code to remove it:
import pandas as pd
df = pd.read_csv('C:/Users/tkp/Desktop/Holdings_Download/dws/example.csv',index_col=False, sep=';', encoding='utf-8')
print(df['col'][0])
df = df['col'][0].encode("utf-8").replace(b"\x1d", b"").decode()
df = pd.DataFrame([x.split(';') for x in df.split('\n')])
print(df[0][0])
Output:
Is there another way to do this? It seems to me that I couldn't have done it any worse than this.
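One possible direction, as a minimal sketch rather than a definitive fix: strip the control characters from each string value directly, without encoding to bytes and rebuilding the DataFrame. The names strip_control_chars and CONTROL_CHARS and the sample string are hypothetical. The character range is also an assumption: it removes every ASCII control character except tab and the line breaks, which may be broader than what is wanted here.

```python
import re

# Matches ASCII control characters except tab, line feed and carriage return.
# The range includes \x1d (group separator); narrow it if other control
# characters must be kept. (Assumed range, not taken from the original post.)
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(value):
    """Return value without control characters; non-strings pass through unchanged."""
    if isinstance(value, str):
        return CONTROL_CHARS.sub("", value)
    return value


# Hypothetical sample standing in for one cell of the CSV.
sample = "ab\x1dc;12\x1d3"
cleaned = strip_control_chars(sample)
print(repr(cleaned))
```

With pandas, this could presumably be applied to the whole column with df['col'] = df['col'].map(strip_control_chars), which would clean every row instead of only the first one. If only the group separator needs to go, df['col'].str.replace('\x1d', '', regex=False) is a narrower option.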