
I am trying to create a random Unicode generator and made a function that can create 16-bit Unicode characters. This is my code:

import random

def rand_unicode():
    # Build a 4-digit hex string such as '1A3F' (range 0000-1FFF).
    digits = []  # avoid shadowing the builtin name `list`

    digits.append(str(random.randint(0, 1)))
    for _ in range(3):
        if random.randint(0, 1):
            # Only A-F are valid hex letters; G-Z cannot be part of a code point.
            digits.append(random.choice('ABCDEF'))
        else:
            digits.append(str(random.randint(0, 9)))

    return ''.join(digits)


print(rand_unicode())

The problem is that whenever I try to add a '\u' in the print statement, Python gives me the following error:

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

I tried raw strings but that only gives me output like '\u0070' without turning it into a unicode character. How can I properly connect the strings to create a unicode character? Any help is appreciated.

Mahir Islam
Another_coder
  • Possible duplicate of ["Unicode Error "unicodeescape" codec can't decode bytes... Cannot open text files in Python 3](https://stackoverflow.com/questions/1347791/unicode-error-unicodeescape-codec-cant-decode-bytes-cannot-open-text-file) – shuberman Aug 11 '19 at 10:39
  • I saw that one, but it is nearly a decade old with a much older version of python, and the answers there only partially answered this problem. Also, I am using a mac. – Another_coder Aug 11 '19 at 10:58
  • tried the 'u' flag? maybe find something helpful [here](https://stackoverflow.com/questions/2081640/what-exactly-do-u-and-r-string-flags-do-and-what-are-raw-string-literals) – FObersteiner Aug 11 '19 at 10:59
  • Possible duplicate of [Process escape sequences in a string in Python](https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python) – Joe Aug 11 '19 at 11:57
  • \u0070 is a single character, just like \n is a single character. Concatenating \u with 0070 does not mean what you think it means. Building the escape sequence literal just to create a character is a waste. – MisterMiyagi Aug 11 '19 at 12:07

2 Answers

From:

The problem is that whenever I try to add a '\u' in the print statement, Python gives me the following error:

it sounds like the problem may be in code you haven't included in your question:

print('\u' + rand_unicode())

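This won't do what you expect, because Python processes the '\u' escape when it parses the string literal, before the strings are concatenated, and a '\u' with no hex digits after it is a truncated escape. See [Process escape sequences in a string in Python](https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python) and try: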

print(bytes('\\u' + rand_unicode(), 'us-ascii').decode('unicode_escape'))
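To make the difference between an escape in a literal and one built at run time concrete, here is a minimal sketch. The fixed string `'0070'` is only a stand-in for the output of `rand_unicode()` and is assumed to hold four valid hex digits:

```python
hex_digits = '0070'  # stand-in for rand_unicode(); assumed to be 4 valid hex digits

# '\\u' is a backslash followed by 'u': two ordinary characters, no escape yet.
escaped = '\\u' + hex_digits
print(len(escaped))  # 6

# Decoding the ASCII bytes with 'unicode_escape' processes the escape at run time.
char = escaped.encode('us-ascii').decode('unicode_escape')
print(char)  # p
```

If the string contains anything other than hex digits after the `\u`, the decode step raises `UnicodeDecodeError`, so the input needs to be validated or generated from hex digits only.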
Joe

A unicode escape sequence such as \u0070 is a single character. It is not the concatenation of \u and the ordinal.

>>> '\u0070' == 'p'
True
>>> '\u0070' == (r'\u' + '0070')
False

To convert an ordinal to a unicode character, you can pass the numerical ordinal to the chr builtin function. Use int(literal, 16) to convert a hex-literal ordinal to a numerical one:

>>> ordinal = '0070'
>>> chr(int(ordinal, 16))  # convert literal to number to unicode
'p'
>>> chr(int(rand_unicode(), 16))
'ᚈ'
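Since `int(literal, 16)` raises `ValueError` for anything that is not a hex digit, the generated string must only contain `0-9` and `A-F`. A minimal sketch, where `rand_hex_ordinal` is a hypothetical stand-in for `rand_unicode()` that keeps the same shape (a leading 0 or 1 followed by three more digits):

```python
import random

HEX_DIGITS = '0123456789ABCDEF'

def rand_hex_ordinal():
    """Hypothetical stand-in for rand_unicode(): 4 hex digits, '0000' to '1FFF'."""
    first = random.choice('01')
    rest = ''.join(random.choices(HEX_DIGITS, k=3))
    return first + rest

ordinal = rand_hex_ordinal()
char = chr(int(ordinal, 16))  # hex string -> number -> character
print(ordinal, repr(char))
```

The output differs on every run, and many code points in this range are unassigned or non-printing, so `repr` is used to make the result visible.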

Note that creating a literal ordinal is not required. You can directly create the numerical ordinal:

>>> chr(112)  # convert decimal number to unicode
'p'
>>> chr(0x0070)  # convert hexadecimal number to unicode
'p'
>>> chr(random.randint(0, 0x10FFFF))  # 0x10FFFF is the highest code point chr accepts
'嚟'
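One caveat when drawing from the whole range: the code points U+D800 to U+DFFF are surrogates. `chr` accepts them, but they cannot be encoded as UTF-8, so printing them typically fails. A minimal sketch that skips them, assuming the intended upper bound is 0x10FFFF (`random_code_point_char` is a hypothetical helper name):

```python
import random

def random_code_point_char():
    """Return a random non-surrogate character (hypothetical helper)."""
    while True:
        code_point = random.randint(0, 0x10FFFF)
        # Skip the surrogate block U+D800..U+DFFF, which has no UTF-8 encoding.
        if not 0xD800 <= code_point <= 0xDFFF:
            return chr(code_point)

c = random_code_point_char()
print(repr(c), hex(ord(c)))
```

This still returns unassigned and private-use code points; filtering those would need something like the `unicodedata` module and is left out here.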
MisterMiyagi