I would like to convert Unicode characters to their UTF-8 bytes and then to a hex string in which each byte is prefixed with '%', because the API server only recognizes this form.

For example, if I need to send the Unicode character '※' to the server, I should convert it to the string '%E2%80%BB'. (It does not matter whether the hex digits are upper or lower case.)

I found a way to convert a Unicode character to bytes, and the bytes to a hex string, in https://stackoverflow.com/a/35599781.

>>> print('※'.encode('utf-8'))
b'\xe2\x80\xbb'
>>> print('※'.encode('utf-8').hex())
e280bb

But I need a form in which each byte starts with '%', like '%E2%80%BB' or '%e2%80%bb'.

Is there a concise way to implement this? Or do I need to write a function that adds '%' before each pair of hex digits?

dolgom

1 Answer

There are two ways to do this:

The preferred solution is to use urllib.parse.urlencode, which takes multiple parameters and encodes them all at once:

import urllib.parse

urllib.parse.urlencode({'parameter': '※', 'test': 'True'})
# 'parameter=%E2%80%BB&test=True'
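If you only need to encode a single value rather than a whole query string, urllib.parse.quote from the standard library should give the same percent-encoding. A minimal sketch (the variable name is only for illustration):

```python
from urllib.parse import quote

# quote() encodes the string as UTF-8 by default and percent-escapes
# every byte that is not an unreserved URL character.
encoded = quote('※')
print(encoded)  # %E2%80%BB

# ASCII letters and digits are left untouched; '/' is also kept by
# default, so pass safe='' if the server expects it escaped as well.
print(quote('a/※', safe=''))  # a%2F%E2%80%BB
```

Note that quote leaves plain ASCII letters and digits as they are, which is normally what a URL-decoding server expects.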

Or you can manually split the hex string into chunks of two characters, then join them with the % symbol:

def encode_symbol(symbol):
    symbol_hex = symbol.encode('utf-8').hex()
    symbol_groups = [symbol_hex[i:i + 2].upper() for i in range(0, len(symbol_hex), 2)]
    symbol_groups.insert(0, '')
    return '%'.join(symbol_groups)

encode_symbol('※')
# %E2%80%BB
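The manual approach can probably be written more compactly by iterating over the encoded bytes directly instead of slicing the hex string. A sketch, where percent_encode_all is a made-up name and not a library function:

```python
def percent_encode_all(text):
    # Hypothetical helper: prefix every UTF-8 byte with '%' and
    # format it as two uppercase hex digits.
    return ''.join(f'%{byte:02X}' for byte in text.encode('utf-8'))

print(percent_encode_all('※'))  # %E2%80%BB
```

Like encode_symbol, this escapes every byte, so an ASCII character such as 'a' becomes '%61' rather than staying as 'a'.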
m0nhawk