I am receiving the error below, caused by a dash character "-":

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 38: ordinal not in range(128)

I have tried using skills.encode('utf-8'), but I still get the error. Below is the code I use to write to the CSV file.

 writer.writerow([name.encode('utf-8'),
                 heading.encode('utf-8'),
                 location.encode('utf-8'),
                 education.encode('utf-8'),
                 summary,
                 currentRole,
                 allRoles,
                 companiesFollowed,
                 groups,
                 skills.encode('utf-8')])
dcraven
  • only you could say what that character is supposed to be. – Adam Smith Apr 29 '19 at 01:53
  • The character is the dash – dcraven Apr 29 '19 at 01:56
  • ah, not just a normal `-` dash, but an _n-dash_ https://www.fileformat.info/info/unicode/char/2013/index.htm. Yeah you can't represent that in ASCII. What would you like to be there instead? – Adam Smith Apr 29 '19 at 01:59
  • Just a regular dash if possible. I don't know the difference between a dash and an en dash. – dcraven Apr 29 '19 at 02:01
  • if you use Python3 that's what you'll get (and you can skip the `encode`s). If you don't, you'll have to learn a whole lot more than you know now about Unicode and what a Unicode code point means. I'm not aware of any function that can just _know_ that you want an n-dash to be a hyphen. You could make your own string replacement using `string.maketrans` and `str.translate` but that's about it. – Adam Smith Apr 29 '19 at 02:08
  • Possible duplicate of [Convert Unicode to ASCII without errors in Python](https://stackoverflow.com/questions/2365411/convert-unicode-to-ascii-without-errors-in-python) – accdias Apr 29 '19 at 02:11

1 Answer

str.encode accepts an errors keyword that selects how characters the codec cannot encode are handled. The docs list the available handlers, but I'd recommend the 'replace' error handler.

writer.writerow([name.encode('utf-8', errors='replace'),
    heading.encode('utf-8', errors='replace'),
    location.encode('utf-8', errors='replace'),
    education.encode('utf-8', errors='replace'),
    summary,
    currentRole,
    allRoles,
    companiesFollowed,
    groups,
    skills.encode('utf-8', errors='replace')])

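This produces a bytes object with a ? in place of each code point the target codec cannot encode. Note that UTF-8 can encode U+2013, so the substitution only takes effect with a narrower codec such as 'ascii'.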
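If a plain hyphen is an acceptable stand-in for the en dash, as requested in the comments, one option is to translate the dash characters first and only then encode to ASCII. The following is a minimal sketch, not a definitive fix: `DASH_MAP` and `to_ascii` are hypothetical names introduced for illustration, and the sketch assumes the input values are Unicode strings.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper: map en/em dashes to a plain hyphen, then encode
# to ASCII. Any other non-ASCII character becomes '?' via 'replace'.

# Keys are Unicode code points; values are the replacement text.
DASH_MAP = {
    0x2013: u'-',  # EN DASH
    0x2014: u'-',  # EM DASH
}


def to_ascii(text):
    """Return ASCII bytes for a Unicode string, with dashes as hyphens."""
    return text.translate(DASH_MAP).encode('ascii', 'replace')


print(to_ascii(u'2015 \u2013 2019'))
```

In Python 2 every Unicode field passed to writer.writerow would need this treatment (or an explicit UTF-8 encode), since any field left as a Unicode string is likely to be encoded implicitly with the 'ascii' codec.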

Adam Smith
  • Thanks for your response @Adam. That didn't work, unfortunately. I got this error: skills.encode('utf-8', errors='replace')]) UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 38: ordinal not in range(128) – dcraven Apr 29 '19 at 02:42
  • @dcraven I can't reproduce that result. What version of Python are you running? – Adam Smith Apr 29 '19 at 03:25