I am running a web server with Python (3.6.9) and Django. On the client, I use JavaScript to encode some information as a base64 string and send it to the server in a POST request. I then decode the base64 string on the server in Python. However, Python raises an error when decoding the string if it contains non-English characters.

I've tried encoding and decoding strings in both Python and JavaScript, and the base64-encoded string differs between the two when the string contains non-English characters. I assume the JavaScript encoding is correct because JavaScript can decode it again without error and gets back the original string, including the non-English characters. I need to reproduce this behaviour in Python so that I can correctly decode the base64 string (generated by JavaScript) on the server.

// javascript

// encode
btoa('abcú') // YWJj+g==

// decode
atob(btoa('abcú')) // abcú

# python
import base64

# encode
a = base64.b64encode('abcú'.encode('utf-8')).decode('utf-8')  # YWJjw7o=

# decode
# error raised: UnicodeEncodeError: 'ascii' codec can't encode character '\xfa' in position 3: ordinal not in range(128)
a = base64.b64decode(a).decode('utf-8')
# print(a)  # I want to print the original string 'abcú' here

Both translate 'abc' to 'YWJj', but Python translates 'ú' to 'w7o=' while JavaScript translates it to '+g=='.

How can I make Python correctly decode this string with the non-English character?

snakecharmerb
zonzon510

1 Answer

The JavaScript code is probably* using the latin-1 / ISO-8859-1 encoding, so decoding the bytes as latin-1 rather than UTF-8 recovers the original string:

>>> s = "YWJj+g=="
>>> import base64
>>> base64.b64decode(s)
b'abc\xfa'
>>> base64.b64decode(s).decode('latin-1')
'abcú'
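
The same steps can be wrapped in a small helper for use on the server. This is only a sketch: `decode_btoa` is a hypothetical name, and it assumes the client really does call `btoa` directly on the string, so that every character maps to a single byte.

```python
import base64


def decode_btoa(b64_text: str) -> str:
    """Decode a base64 string assumed to come from the browser's btoa().

    Hypothetical helper: the base64 payload is turned back into bytes,
    and each byte is then read as one latin-1 character.
    """
    raw = base64.b64decode(b64_text)
    return raw.decode("latin-1")


result = decode_btoa("YWJj+g==")
# ascii() escapes non-ASCII characters, so printing is safe even when
# the console encoding is ASCII.
print(ascii(result))  # 'abc\xfa'
```

Printing through `ascii()` is a deliberate choice here: a `UnicodeEncodeError` mentioning the 'ascii' codec usually comes from writing non-ASCII text to an ASCII-only output stream, not from the base64 decoding itself.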

*There are other 8-bit encodings, such as cp1252, which provide the same result, but latin-1 was the "universal" encoding on the web before the rise of UTF-8. It's worth noting that latin-1 only supports a limited range of non-ASCII characters; the answers to this question provide some information on using UTF-8 and base64 in the browser.
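
To illustrate the limitation, here is a minimal sketch of what happens with a character outside latin-1, and of the UTF-8 round trip on the Python side. It assumes the client would be changed to base64-encode UTF-8 bytes instead of calling `btoa` directly on the string; that client-side change is not shown here, and the sample text is made up for the example.

```python
import base64

# Sample text (an assumption for illustration): 'abc', U+00FA (u with
# acute accent) and U+20AC (euro sign, which is outside latin-1).
text = "abc\u00fa\u20ac"

# latin-1 only covers code points up to U+00FF, so this encode fails.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# UTF-8 covers all of Unicode, so the round trip preserves the text.
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

print(latin1_ok, decoded == text)  # False True
```

If both sides agree on UTF-8, the server-side decode in the question's Python code would work unchanged.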

snakecharmerb