I have written the following Python code to convert a file to UTF-8. It works well, but I noticed that if the file is too big (in this case we are talking about 10 GB!) the program crashes.
In general it seems to take too much time: about 9 minutes to convert a 2 GB text file. Maybe I can make it more efficient? I think it's because I'm first reading the whole file into memory and only then saving it; could that be the cause?
import sys
import codecs

filename = sys.argv[1]

# Read the whole file into memory, decoding it from Latin-1...
with codecs.open(filename, 'r', encoding='iso-8859-1') as f:
    text = f.read()

# ...then overwrite the same file, re-encoding the text as UTF-8.
with codecs.open(filename, 'w', encoding='utf8') as f:
    f.write(text)
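
For what it's worth, here is a rough sketch of the streaming approach I was thinking of, assuming Python 3 (so the built-in open handles encodings) and assuming it's acceptable to write to a temporary file and swap it in afterwards, since you can't safely read from and write to the same path at the same time:

import os
import sys
import tempfile

filename = sys.argv[1]
CHUNK_SIZE = 1024 * 1024  # convert roughly one million characters at a time

# Create the temp file in the same directory as the original, so that
# os.replace() can swap it in atomically on the same filesystem.
dir_name = os.path.dirname(os.path.abspath(filename))
with open(filename, 'r', encoding='iso-8859-1') as src, \
     tempfile.NamedTemporaryFile('w', encoding='utf8',
                                 dir=dir_name, delete=False) as dst:
    while True:
        chunk = src.read(CHUNK_SIZE)
        if not chunk:  # empty string means end of file
            break
        dst.write(chunk)
    tmp_name = dst.name

# Only replace the original once the whole conversion has succeeded.
os.replace(tmp_name, filename)

Since ISO-8859-1 is a single-byte encoding, reading in fixed-size character chunks can't split a character in half, and the temp-file swap means a crash mid-conversion won't destroy the original file. Would something like this be the right direction?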