I'm trying to write a JSON response to a CSV file using csv.DictWriter. Below is the code:

import csv
import requests
import codecs


url = "xxxx"

data = requests.get(url).json()

with codecs.open("xxx.csv", 'w', encoding='utf-8') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=['id', 'name', 'age', 'company', 'sex', 'job', 'time', 'main', 'sub', 'thor'])
    writer.writeheader()

    for row in data:
        writer.writerow(row)

This is the error I keep getting:

Traceback (most recent call last):
  File "so_test.py", line 15, in <module>
    writer.writerow(row)
  File "/usr/lib64/python2.7/csv.py", line 148, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 8: ordinal not in range(128)

I tried using io.open and open(file, 'wb', 'utf-8'), but nothing worked. Can someone help?

jahan
Comment: The `csv` module in Python 2 doesn't support encodings without extra work. Read the bottom of the module docs. It's all fixed in Python 3, and Python 2 is end of life. – Mark Tolonen Sep 01 '20 at 02:40
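
To illustrate the comment, here is a minimal sketch of the same loop on Python 3, where the file object handles the encoding and `csv` works on text directly. The sample `data`, the temporary path, and the helper name `write_rows` are assumptions made for illustration; they stand in for the real `requests.get(url).json()` result and output file.

```python
import csv
import os
import tempfile

# Hypothetical sample standing in for requests.get(url).json().
# The first name contains U+00A0 (no-break space), the character in the traceback.
data = [
    {"id": 1, "name": "John\xa0Doe", "age": 30},
    {"id": 2, "name": "Zo\xeb", "age": 25},
]

# Hypothetical output location, used so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "example.csv")


def write_rows(path, rows, fieldnames):
    """Write a list of dicts to a UTF-8 encoded CSV file (Python 3)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)


write_rows(path, data, ["id", "name", "age"])
```

On Python 2.7 the usual workaround is different: open the file in binary mode ('wb') and encode every unicode value to UTF-8 before passing the row to writerow.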

2 Answers

Use encoding='utf-8-sig' or encoding='latin-1' when opening the file.

Alternatively, write the byte order mark yourself by adding csv_file.write(u'\ufeff') right after opening the file, as in the code below:

with codecs.open("xxx.csv", 'w', encoding='utf-8') as csv_file:
    csv_file.write(u'\ufeff')
    writer = csv.DictWriter(csv_file, fieldnames=['id', 'name', 'age', 'company', 'sex', 'job', 'time', 'main', 'sub', 'thor'])
    writer.writeheader()

    for row in data:
        writer.writerow(row)
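
As a rough check of what the two variants have in common, the sketch below (assuming Python 3, a hypothetical one-row sample, and a temporary file) opens the file with 'utf-8-sig' and confirms that the codec emits the same byte order mark that csv_file.write(u'\ufeff') would write by hand. The BOM mainly helps programs such as Excel detect UTF-8; it only affects the bytes in the file, so whether it clears the error on Python 2 should be verified, since the traceback is raised inside writerow.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical row containing non-ASCII characters, for illustration only.
rows = [{"id": 1, "name": "caf\xe9\xa0bar"}]

# Hypothetical output location in a temporary directory.
bom_path = os.path.join(tempfile.mkdtemp(), "with_bom.csv")

# 'utf-8-sig' writes the BOM automatically on the first write.
with open(bom_path, "w", encoding="utf-8-sig", newline="") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=["id", "name"])
    writer.writeheader()
    writer.writerows(rows)

# Read the raw bytes back to inspect what was actually written.
with open(bom_path, "rb") as f:
    raw = f.read()

has_bom = raw.startswith(codecs.BOM_UTF8)
print(has_bom)
```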
Ahmed Mamdouh
This normally happens when the data being written contains characters outside the standard ASCII range, for example accented letters (à, é, ü) or special formatting characters pasted from Word or a similar tool, such as the no-break space u'\xa0' in your traceback.

Try removing any such characters before writing to the file. A quick search on Stack Overflow suggests the options below. They may cause some data loss, but they will help you compare the output, see which special character is missing, and decide whether it is really required.

Option 1:

import re
# re.sub returns a new string, so keep the result
cleaned_string = re.sub(r'[^\x00-\x7f]', r'', string_variable)

Option 2:

encoded_string = original_string.encode("ascii", "ignore")
decoded_string = encoded_string.decode()

Option 3: see the Stack Overflow answer on using a lambda function to remove non-ASCII characters.
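
Options 1 and 2 can be compared side by side with a short sketch like the following. It assumes Python 3 and a made-up sample string; the final `normalized` variant, which swaps the no-break space for a regular space instead of dropping characters, is an extra suggestion and not one of the options listed above.

```python
import re

# Hypothetical sample with a no-break space (U+00A0) and an accented letter.
original_string = "John\xa0Doe caf\xe9"

# Option 1: drop every character outside the 7-bit ASCII range.
stripped_re = re.sub(r"[^\x00-\x7f]", "", original_string)

# Option 2: encode to ASCII, silently skipping anything that cannot be represented.
stripped_codec = original_string.encode("ascii", "ignore").decode("ascii")

# Less lossy variant (an assumption, not from the options above):
# replace only the no-break space and keep every other character.
normalized = original_string.replace("\xa0", " ")

print(stripped_re)
print(stripped_codec)
print(normalized)
```

Both stripping options remove the accented letter as well as the no-break space, which is the data loss mentioned above.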

SSS