I'm stuck and I feel stupid.
I've got a database of tweets that I'm exporting to a .CSV file using .NET. I'd like to analyze this data in Python with Pandas and NLTK. However, I'm totally stuck on the first step: reading the CSV in Python. This led to this soup of problems: Python open CSV file with supposedly mixed encodings?
Surely it can't be that hard to open a file and print the text when I'm the one creating the text file?
I'm using the following C# code to generate the CSV file (supposedly in UTF-8?):
using (FileStream fs = new FileStream(fullFileName, FileMode.Append, FileAccess.Write))
using (StreamWriter sw = new StreamWriter(fs, Encoding.UTF8))
According to chardet the encoding is: ISO-8859-2.
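Since chardet's answer is a statistical guess, a direct check on the bytes may be more telling. The following is a minimal sketch, not a definitive diagnosis: it looks for a UTF-8 byte order mark and tries a strict UTF-8 decode, reporting the offset of the first invalid byte if there is one. The helper names `sniff_utf8` and `read_rows` and the `sample` bytes are hypothetical stand-ins; for the real export you would pass in the result of `open(path, "rb").read()`.

```python
import codecs
import csv
import io


def sniff_utf8(raw: bytes) -> dict:
    """Report whether raw bytes start with a UTF-8 BOM and decode strictly as UTF-8."""
    has_bom = raw.startswith(codecs.BOM_UTF8)
    try:
        raw.decode("utf-8")  # strict mode: raises on the first invalid byte
        valid, bad_offset = True, None
    except UnicodeDecodeError as exc:
        valid, bad_offset = False, exc.start
    return {"has_bom": has_bom, "valid_utf8": valid, "bad_offset": bad_offset}


def read_rows(raw: bytes) -> list:
    """Parse CSV bytes as UTF-8; the 'utf-8-sig' codec drops a leading BOM if present."""
    text = raw.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical sample standing in for the exported file: a BOM plus two rows.
sample = codecs.BOM_UTF8 + "id,text\r\n1,caf\u00e9\r\n".encode("utf-8")
print(sniff_utf8(sample))
print(read_rows(sample))
```

If the strict decode fails, the reported offset points at the first byte worth inspecting. If it succeeds, the file is valid UTF-8 regardless of what chardet guessed, and the same `utf-8-sig` codec name can be passed as the `encoding` argument to `pandas.read_csv`.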
A little hint in the right direction would be greatly appreciated...