I created a function, parse_html(param), that returns a list like the one below:
list = [u'John', u'Muchia', u'Prozessoptimierung Fahrwiderst\xe4nde']
If I run print list[2] inside my function, it gives me Prozessoptimierung Fahrwiderstände,
which is perfect, but the string appears differently when it is part of a list.
The problem arises when I return the whole list (return list) and print it.
I want to avoid the 'u'. I want to store a list of strings in which Unicode characters like ä, ö and ü also appear correctly.
fname[x] is the source of the HTML file where x is the file number which is incremented from 0 to count(file_number)
list=[]
newlist=[]
list = parse_html(fname[7])
for row in list:
    drow = row.encode('utf-8')
    newlist.append(drow)
print newlist
The goal is to save the returned list to a CSV file. Every time a new file (fname) is selected, a new list is created, and it should be appended to the previously created CSV file.
I can tell I am doing something really wrong, and my head hurts. Please help.
update:
import csv

for x in range(0, count):
    list = parse_html(fname[x])
    with open('output.csv', 'wb') as myfile:
        wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
        wr.writerow(list)
error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 132: ordinal not in range(128)
Answer:
wr.writerow([c.encode('utf-8') for c in list])  # instead of wr.writerow(list)
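Two side notes. The u prefix and the \xe4 escape are only how Python 2 displays the elements of a list when the list itself is printed; the strings are stored correctly, which is why print list[2] looks fine. Also, opening output.csv with 'wb' inside the loop truncates the file on every iteration, so only the last row would survive. A minimal sketch of the whole append step might look like the following. The helper name append_rows is hypothetical (it is not from the question), and the Python 3 branch is an assumption added for readers on a newer interpreter, where the csv module expects a text file rather than encoded bytes.

```python
# -*- coding: utf-8 -*-
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def append_rows(path, rows):
    """Append each row (a list of unicode strings) to a UTF-8 CSV file.

    Hypothetical helper for illustration; not part of the original code.
    """
    if PY2:
        # Python 2: the csv module works on bytes, so open the file in
        # binary append mode and encode every cell to UTF-8 first.
        with open(path, 'ab') as f:
            wr = csv.writer(f, quoting=csv.QUOTE_ALL)
            for row in rows:
                wr.writerow([c.encode('utf-8') for c in row])
    else:
        # Python 3: str is already Unicode, so let the file object do
        # the encoding; newline='' leaves line endings to the csv module.
        with io.open(path, 'a', encoding='utf-8', newline='') as f:
            wr = csv.writer(f, quoting=csv.QUOTE_ALL)
            for row in rows:
                wr.writerow(row)
```

Inside the loop from the question it could then be called once per file, for example append_rows('output.csv', [parse_html(fname[x])]), so each parsed list becomes one new row and earlier rows are kept.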