Please, please, please help. I've been struggling with this for a while and have run into problem after problem. I'm just trying to write a loop that opens every CSV file in a folder. Here's the loop:
folder = '/Users/jolijttamanaha/Documents/Senior/Thesis/Python/TextAnalysis/datedmatchedngrams2/'
for file in os.listdir(folder):
    with codecs.open(file, mode='rU', encoding='utf-8') as f:
        m=min(int(line[1]) for line in csv.reader(f))
        f.seek(0)
        for line in csv.reader(f):
            if int(line[1])==m:
                print line
Here's the error:
Traceback (most recent call last):
  File "findfirsttrigram.py", line 11, in <module>
    m=min(int(line[1]) for line in csv.reader(f))
  File "findfirsttrigram.py", line 11, in <genexpr>
    m=min(int(line[1]) for line in csv.reader(f))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 684, in next
    return self.reader.next()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 615, in next
    line = self.readline()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 530, in readline
    data = self.read(readsize, firstline=True)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 477, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x87 in position 0: invalid start byte
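For reference, the byte named in the error, 0x87, has the bit pattern 10000111. UTF-8 only allows that pattern as a continuation byte, never as the first byte of a character, which is what "invalid start byte" refers to. The same message can be reproduced outside the loop with a minimal sketch like this (the helper name `try_utf8` and the sample bytes are made up for illustration, not taken from my files):

```python
def try_utf8(raw):
    """Return the decoded text, or the error message if raw is not valid UTF-8."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError as exc:
        return str(exc)

# Hypothetical sample: a lone 0x87 byte followed by plain ASCII.
message = try_utf8(b'\x87abc')
print(message)
```

So whatever is being read at that point starts with a byte that cannot begin a UTF-8 character, which suggests the data is either not text or not UTF-8.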
I got here because I first had a "Null Byte" error, which I solved with this post: "Line contains NULL byte" in CSV reader (Python).
Then I got an integer error, which I solved with this post: "an integer is required" when open()'ing a file as utf-8?
Then I got an error that said 'UnicodeException: UTF-16 stream does not start with BOM', which I solved with this post: utf-16 file seeking in python. how?
Then I realized that the csv module requires UTF-8, so here I am.
But I've finally hit the limit of the existing questions, and I can't figure out what is going on. Please, please help.
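One thing that might be worth checking before blaming the CSV files themselves: os.listdir returns bare names rather than full paths, and it lists every entry in the folder, including hidden files such as the .DS_Store that Finder creates on macOS. The sketch below is only a diagnostic under those assumptions, not a confirmed fix. It builds full paths, keeps only names ending in .csv, and guesses an encoding from the byte order mark. The helper names `sniff_encoding` and `csv_paths`, the UTF-8 fallback, and the demo files are all hypothetical.

```python
import codecs
import os
import tempfile

def sniff_encoding(path, default='utf-8'):
    """Guess a codec from the file's byte order mark, else fall back to default."""
    with open(path, 'rb') as f:
        head = f.read(4)
    # UTF-32 is checked first because its LE BOM begins with the UTF-16 LE BOM.
    for bom, name in ((codecs.BOM_UTF32_LE, 'utf-32'),
                      (codecs.BOM_UTF32_BE, 'utf-32'),
                      (codecs.BOM_UTF8, 'utf-8-sig'),
                      (codecs.BOM_UTF16_LE, 'utf-16'),
                      (codecs.BOM_UTF16_BE, 'utf-16')):
        if head.startswith(bom):
            return name
    return default

def csv_paths(folder):
    """Yield full paths of the .csv files in folder, skipping everything else."""
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith('.csv'):
            yield os.path.join(folder, name)

# Demo in a throwaway folder: one UTF-16 CSV plus a stand-in hidden file
# whose contents are invented for this example.
demo = tempfile.mkdtemp()
with open(os.path.join(demo, 'a.csv'), 'wb') as f:
    f.write(codecs.BOM_UTF16_LE + u'x,1\n'.encode('utf-16-le'))
with open(os.path.join(demo, '.DS_Store'), 'wb') as f:
    f.write(b'\x87\x00')

found = [(os.path.basename(p), sniff_encoding(p)) for p in csv_paths(demo)]
print(found)
```

Pointing `csv_paths` at the real folder and printing `sniff_encoding` for each path would show which file the loop chokes on and whether it is actually UTF-8. The sketch deliberately stops before the csv module, so it says nothing about how the rows should be parsed once the encoding is known.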