
I am trying to parse a CSV file (test.csv) that contains the text below:

"legalgroup_text"   "Aktiebolag"    "Aktiebolag"    "LGAKTIEBOLAG"
"legalgroup_text"   "Allmän försäkringskassa"   "Allmän försäkringskassa"   "LGALLMAENFOERSAEKRINGSKASSA"

I am using the iso-8859-1 encoding, since the file contains Swedish characters:

import codecs
import csv

with codecs.open('test.csv', encoding='iso-8859-1') as label_file:
    data = csv.reader(label_file, delimiter='\t')
    for row in data:
        print(row)

I am getting this error:

Traceback (most recent call last):
  File "/mnt/ashraful/PycharmProjects/Test/test.py", line 6, in <module>
    for row in data:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 23-24: ordinal not in range(128)

I also tried the utf-8 encoding, but I get a similar error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 25: ordinal not in range(128)
Ashraful Islam

1 Answer


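Give up on Python 2 and use Python 3. That by itself (no other changes, I just tested) will fix the issue. The Python 2 csv module does not support Unicode input, so the unicode lines that codecs.open yields get implicitly encoded as ASCII, and that is what raises the UnicodeEncodeError.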
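
For reference, here is a minimal Python 3 sketch of the same read. The helper name read_rows and the temporary sample file are my own additions for illustration and are not part of the question; the sample file just stands in for test.csv so the snippet runs on its own. In Python 3 the built-in open accepts an encoding argument, so codecs.open is not needed, although the original script also works unchanged.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="iso-8859-1", delimiter="\t"):
    """Return every row of a delimited text file as a list of str values.

    read_rows is a hypothetical helper name used only for this sketch.
    """
    # newline="" lets the csv module handle line endings itself,
    # as the csv documentation recommends.
    with open(path, encoding=encoding, newline="") as label_file:
        return list(csv.reader(label_file, delimiter=delimiter))


# Stand-in for test.csv: one tab-separated line with Swedish characters,
# written as ISO-8859-1 into a temporary directory.
sample = (
    '"legalgroup_text"\t"Allmän försäkringskassa"\t'
    '"LGALLMAENFOERSAEKRINGSKASSA"\n'
)

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "test.csv")
    with open(sample_path, "w", encoding="iso-8859-1", newline="") as out:
        out.write(sample)
    rows = read_rows(sample_path)
```

Each entry in rows is a list of ordinary str objects, so the Swedish characters come through decoded and no implicit ASCII encoding step takes place.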

tel