I have an input CSV file. When I try to run some operations on it and write an output file, I get a decoding error.
At first I got a 'utf-8' codec error, so I searched and checked the encoding of my file with this:
import chardet

# Detect the encoding from the first 100000 bytes of the file
with open('1out_test.csv', 'rb') as rawdata:
    result = chardet.detect(rawdata.read(100000))
result
Output: {'confidence': 1.0, 'encoding': 'ascii'}
Then I wrote the following:
import re
import pandas as pd

WORDS, N = ["aaaa", "tttt"], 1
# Match any of WORDS plus up to N space-separated tokens on either side
pattern = (
    rf"((?:\S+ +){{0,{N}}}\S*"
    fr"\b(?:{'|'.join(map(re.escape, WORDS))})\b"
    rf"\S*(?: +\S+){{0,{N}}})"
)
(
    pd.read_csv("1out_test.csv", encoding='ascii', low_memory=False)
    .assign(info=lambda x: x["remarks"].str.extract(pattern, flags=re.IGNORECASE, expand=False).fillna("NA"))
    .to_csv("output.csv", index=False)
)
This gave me the same kind of error again, but this time for 'ascii': 'ascii' codec can't decode byte 0xe2 in position 31: ordinal not in range(128)
NOTE: Both errors reported the same position, 31.
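Since byte 0xe2 is outside the ASCII range (0x00 to 0x7f), the file cannot be pure ASCII at that point, whatever chardet reported. One way to see what is actually there is to scan the raw bytes for non-ASCII values. This is only a minimal sketch: the helper name find_non_ascii and the sample bytes are hypothetical, and the sample stands in for the real file contents.

```python
def find_non_ascii(data: bytes, context: int = 10):
    """Return (offset, byte value, surrounding bytes) for each non-ASCII byte."""
    hits = []
    for offset, value in enumerate(data):
        if value > 0x7F:  # anything above 0x7f cannot be decoded as ASCII
            start = max(0, offset - context)
            hits.append((offset, value, data[start:offset + context]))
    return hits


# Hypothetical sample standing in for the real file contents
sample = b"id,remarks\n1,it" + b"\xe2\x80\x99" + b"s fine\n"
for offset, value, snippet in find_non_ascii(sample):
    print(offset, hex(value), snippet)
```

To run it on the real file, the bytes could be read with open('1out_test.csv', 'rb') and passed to the helper in place of sample. The offsets are counted from the start of the data passed in, so they may not line up with the position given in the error message if the reader decodes the file in chunks.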