I'm trying to read a GloVe file, glove.twitter.27B.200d.txt. I have the following function to read the file:
def glove_reader(glove_file):
    glove_dict = {}
    with open(glove_file, 'rt', encoding='utf-8') as glove_reader:
        for line in glove_reader:
            tokens = line.rstrip().split()
            vect = [float(token) for token in tokens[1:]]
            glove_dict[tokens[0]] = vect
    return glove_dict
The problem is that I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xea in position 0: invalid continuation byte
I tried latin-1 as the encoding, but that didn't work either. It throws the following error:
ValueError: could not convert string to float: 'Ù\x86'
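A possible explanation for this second error, offered as a minimal sketch rather than a confirmed diagnosis: `str.split()` with no argument splits on every Unicode whitespace character, and under latin-1 the byte 0x85 decodes to U+0085 (NEL), which Python treats as whitespace. The sample line below is made up for illustration and is not copied from the GloVe file; it assumes a token whose UTF-8 bytes are D9 85 D9 86.

```python
# Hypothetical line (not taken from the real file): a two-letter Arabic
# token, U+0645 U+0646, followed by two vector components.
raw = "\u0645\u0646 0.1 0.2\n".encode("utf-8")

# Decoding as latin-1 never fails, but it maps byte 0x85 to U+0085 (NEL).
line = raw.decode("latin-1")

# split() with no argument treats U+0085 as whitespace and cuts the token in two.
tokens_default = line.rstrip().split()

# Splitting on a literal space keeps the (mis-decoded) token in one piece.
tokens_space = line.rstrip("\n").split(" ")

print(tokens_default)
print(tokens_space)
```

If this is what happens, the second piece of the broken token lands in `tokens[1:]` and `float()` rejects it. Splitting on `' '` would avoid the break, although the keys would still be mis-decoded under latin-1.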
I also tried changing 'rt' to 'r' and 'rb'. I think it is a macOS problem, because on Windows I didn't get this error. Can someone please help me understand why I can't read this file?
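To narrow down the first error, one option is to read the file in binary mode and find the first line that does not decode as UTF-8. The helper below is only a sketch; `first_undecodable_line` is a hypothetical name, and it assumes lines end in `\n`.

```python
def first_undecodable_line(path, encoding="utf-8"):
    """Return (line_number, raw_bytes, error) for the first line that
    fails to decode with `encoding`, or None if every line decodes."""
    with open(path, "rb") as handle:
        # Iterating a binary file yields one bytes object per b"\n"-terminated line.
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as error:
                return line_number, raw, error
    return None
```

Calling `first_undecodable_line("glove.twitter.27B.200d.txt")` and printing the returned bytes would show whether the offending line looks like ordinary text or like a damaged or truncated part of the file.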