I have a small piece of code that uses a regex.
I'm trying to make my records lowercase and strip all punctuation from them, but when I run it I get this error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 5387: character maps to <undefined>
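For context, this message usually means the file was decoded with a single-byte "charmap" code page in which byte 0x81 has no character assigned. The sketch below reproduces the same kind of error on a made-up byte string. It assumes the code page is cp1252, a common default on Windows; the code page actually used on this machine is not known from the traceback.

```python
# Assumption: cp1252 is the code page in play; the real one is unknown.
# Byte 0x81 has no character assigned in cp1252, so decoding it fails.
try:
    b'caf\x81'.decode('cp1252')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(decode_error)
```

If that is what is happening here, the error comes from reading the file, before the regex is ever applied.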
I want to extract the Record ID and Title for the records whose Languages value is English.
import csv
import re
import numpy

filename = ('records.csv')

def reg_test(name):
    reg_result = ''
    with open(name, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            row = re.sub('[^A-Za-z0-9]+', '', str(row))
            reg_result += row + ','
            if (row['Languages'] == 'English')
    return reg_result

print(reg_test(filename).lower())
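
A minimal sketch of one way the goal described above could be approached, not a definitive fix. It filters on the Languages column before any cleaning, so the row is still a dict when the column is read, and it cleans only the Title. The helper name extract_english and the sample rows are hypothetical, the column names are taken from the question, and keeping spaces in the cleaned title is a choice made here (the original pattern removes them as well).

```python
import csv
import io
import re


def extract_english(csvfile):
    """Return (record_id, title) pairs for rows whose Languages is English."""
    results = []
    for row in csv.DictReader(csvfile):
        # Filter on the original dict before any cleaning is applied.
        if row['Languages'].strip().lower() != 'english':
            continue
        # Lowercase the title, then drop everything except letters,
        # digits and spaces.
        title = re.sub(r'[^a-z0-9 ]+', '', row['Title'].lower())
        results.append((row['Record ID'], title))
    return results


# Tiny in-memory sample (made-up data) so the sketch runs without records.csv.
sample = io.StringIO(
    'Record ID,Title,Languages\n'
    '1,"Hello, World!",English\n'
    '2,Bonjour le monde,French\n'
)
english_records = extract_english(sample)
print(english_records)
```

With the real file, the function would be called inside something like `with open('records.csv', 'r', encoding='utf-8', newline='') as csvfile:`. The `utf-8` value is an assumption, since the actual encoding of records.csv is not known. If a few bytes still fail to decode, `errors='replace'` is an existing `open()` option that substitutes a placeholder character instead of raising.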