How can I print all unicode characters?

Question

I want to print a range of Unicode characters, from u'\u1000' up to u'\u1099'. This doesn't work:

for i in range(1000,1100):
    s=unicode('u'+str(i))
    print i,s

Three of the answers here are functionally identical, posted within minutes of each other. Also, while I know this asks for unicode in a certain range, in case anyone came here looking to print the full range, this functionally identical answer [over here](https://stackoverflow.com/a/33043329/1397555) (and a duplicate question too) gives that. — Alex Hall, Aug 10 '17 at 17:02

score 18 · Answer 1 · answered Oct 31 '11 at 21:12

You'll want to use the unichr() builtin function:

for i in range(1000,1100):
    print i, unichr(i)

Note that in Python 3, just chr() will suffice.
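One thing worth noting: `\u1000` is a hexadecimal escape, so `range(1000, 1100)` walks the decimal code points 1000 to 1099 (U+03E8 to U+044B) rather than U+1000 to U+1099. A minimal Python 3 sketch, assuming the hexadecimal range from the question is what is wanted (the names `START`, `STOP`, `lines`, and `text` are illustrative only):

```python
# Assumption: the question means the hexadecimal code points U+1000..U+1099.
START, STOP = 0x1000, 0x1099

# chr() maps an integer code point to a one-character str in Python 3.
# STOP is inclusive, hence the + 1.
lines = [f"U+{cp:04X} {chr(cp)}" for cp in range(START, STOP + 1)]
text = "\n".join(lines)
```

`print(text)` then shows one code point per line; how the characters render depends on the terminal's encoding and font.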

Sanqui


score 12 · Accepted Answer · answered Oct 31 '11 at 21:06

Use unichr:

s = unichr(i)

From the documentation:

unichr(i)

Return the Unicode string of one character whose Unicode code is the integer i. For example, unichr(97) returns the string u'a'.

Mark Byers


score 7 · Answer 3 · answered Oct 31 '11 at 21:07

Try the following:

for i in range(1000, 1100):
    print i, unichr(i)

Andrew Clark


3

Just for fun, here is how awful it is to try and do this without `unichr()`: `print i, eval(r"u'\u" + hex(i)[2:].rjust(4, '0') + "'")` – Andrew Clark Oct 31 '11 at 21:16
1

@FJ: `eval(r"u'\u%04x'" % i)` – John Machin Oct 31 '11 at 21:34
2

Using string substitution and eval is not a good recommendation for a beginner. Pointing the OP to a built-in function specifically designed for this purpose was the correct and ideal answer. – Raymond Hettinger Oct 31 '11 at 21:46
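For comparison, the same escape-string idea can be expressed without `eval` in Python 3 by decoding the escape text with the built-in `unicode_escape` codec. This is only a sketch to show the equivalence (the helper name `from_escape` is made up here); `chr()` remains the simpler route.

```python
def from_escape(code_point):
    """Build the text '\\uXXXX' and decode it, instead of eval-ing it.

    Hypothetical helper; only valid for code points up to 0xFFFF,
    because the \\u escape takes exactly four hex digits.
    """
    escape_text = "\\u%04x" % code_point  # e.g. the six characters \u1000
    return escape_text.encode("ascii").decode("unicode_escape")

# The round trip agrees with the built-in for a couple of sample code points.
assert from_escape(0x1000) == chr(0x1000)
assert from_escape(97) == "a"
```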

score 6 · Answer 4 · answered Oct 31 '11 at 21:07

unichr is the function you are looking for: it takes a number and returns the Unicode character for that code point.

for i in range(1000, 1100):
    print i, unichr(i)

Sean Vieira


score 3 · Answer 5 · edited May 11 '22 at 08:42

(Python 3) The following gives you the characters in an arbitrary Unicode range:

start_code, stop_code = '4E00', '9FFF'  # (CJK Unified Ideographs)
start_idx, stop_idx = [int(code, 16) for code in (start_code, stop_code)]  # from hexadecimal to unicode code point
characters = []
for unicode_idx in range(start_idx, stop_idx+1):
    characters.append(chr(unicode_idx))
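Since the title asks about all Unicode characters, the same loop can be generalized. Below is a rough sketch, assuming Python 3 and a hypothetical helper named `iter_chars`; it skips the surrogate code points, which are not characters and cannot be encoded as UTF-8.

```python
import sys

def iter_chars(start=0, stop=sys.maxunicode):
    """Yield each character from start to stop (both inclusive).

    Hypothetical helper. Surrogate code points (U+D800..U+DFFF) are
    skipped because they cannot be encoded to UTF-8 for output.
    """
    for cp in range(start, stop + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue
        yield chr(cp)

cjk = "".join(iter_chars(0x4E00, 0x9FFF))  # the range used above
everything = sum(1 for _ in iter_chars())  # count across the full range
print(len(cjk), everything)
```

`print(cjk)` would then display the block. Many code points in the full range are unassigned, so printing everything produces a lot of placeholder glyphs.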

edited May 11 '22 at 08:42

answered Oct 02 '19 at 11:08

Bruno Degomme


score 0 · Answer 6 · answered Apr 22 '22 at 15:06

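In Python 3, use chr instead of unichr: unichr no longer exists there, so calling it raises a NameError. Note that print is also a function in Python 3:

for i in range(1000, 1100):
    print(i, chr(i))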
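If the same script has to run under both Python 2 and Python 3, one common pattern is to fall back to `chr` when `unichr` is missing. A small sketch (the alias name `to_char` is an assumption, not something from the answers):

```python
try:
    to_char = unichr  # Python 2: chr() there only covers 0..255
except NameError:
    to_char = chr     # Python 3: unichr() was removed, chr() handles Unicode

# Decimal 1000..1099, as in the loop above.
pairs = [(i, to_char(i)) for i in range(1000, 1100)]
```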

Eduardo Freitas


score 0 · Answer 7 · answered Feb 07 '23 at 21:36

I stumbled across this rather old post and played a bit ...

You can find the Unicode blocks here:
https://en.wikipedia.org/wiki/Unicode_block

And here I am printing some of the blocks:

#!/usr/bin/env python3

ranges = list()

# Just some example ranges ... 
# Plane 0 0000–ffff - Basic Multilingual Plane
ranges.append((0x0000, 0x001f, 'ASCII (Controls)'))
ranges.append((0x0020, 0x007f, 'ASCII'))
ranges.append((0x0100, 0x017f, 'Latin Extended-A'))
ranges.append((0x0180, 0x024f, 'Latin Extended-B'))
ranges.append((0x0250, 0x02af, 'IPA Extensions'))
ranges.append((0x0370, 0x03FF, 'Greek'))
ranges.append((0x4e00, 0x9fff, 'CJK Unified Ideographs')) 

# Plane 1 10000–1ffff - Supplementary Multilingual Plane
ranges.append((0x1f600, 0x1f64f, 'Emoticons'))
ranges.append((0x17000, 0x187ff, 'Tangut'))

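for start, end, name in ranges:
    # print the header of each range
    print(f'{start:x} - {end:x} {name}')
    # the end of each range is inclusive, hence end + 1
    for count, i in enumerate(range(start, end + 1), start=1):
        print(chr(i), end='')
        # wrap the output after every 80 characters
        if count % 80 == 0:
            print('')
    print('\n')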
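The loop above emits every code point in a block, including control characters and unassigned code points, which can clutter or confuse a terminal. One possible refinement, sketched here with a hypothetical helper `printable_chars`, is to filter on the general category reported by the standard `unicodedata` module. The result depends on the Unicode database bundled with the running Python.

```python
import unicodedata

def printable_chars(start, end):
    """Return the characters in start..end (inclusive) worth displaying.

    Hypothetical helper: drops control characters (Cc), surrogates (Cs),
    private-use (Co) and unassigned (Cn) code points.
    """
    skipped = {"Cc", "Cs", "Co", "Cn"}
    return [chr(cp) for cp in range(start, end + 1)
            if unicodedata.category(chr(cp)) not in skipped]

ascii_controls = printable_chars(0x0000, 0x001F)  # nothing survives here
greek = printable_chars(0x0370, 0x03FF)           # gaps in the block are dropped
print(len(ascii_controls), len(greek))
```

`print(''.join(greek))` would then show only the assigned Greek characters.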

score -2 · Answer 8 · edited Feb 10 '23 at 16:36

One might appreciate this php-cli version, which uses HTML entities and UTF-8 decoding. Recent versions of xterm and other terminals support Unicode characters pretty nicely :)

php -r 'for ($x = 0; $x < 255000; $x++) {echo html_entity_decode("&#".$x.";",ENT_NOQUOTES,"UTF-8");}'
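For readers following along in Python, roughly the same entity-decoding trick can be sketched with the standard `html` module. The helper name `from_numeric_entity` is made up for illustration. Note that `html.unescape` follows HTML5 parsing rules, so a few code points (for example 128 to 159) do not map to themselves.

```python
import html

def from_numeric_entity(code_point):
    """Decode the numeric character reference for one code point.

    Hypothetical helper mirroring the html_entity_decode() call above.
    """
    return html.unescape("&#%d;" % code_point)

# Three sample code points: U+0041, U+03B1 and U+1F600.
sample = [from_numeric_entity(cp) for cp in (65, 0x3B1, 0x1F600)]
```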

edited Feb 10 '23 at 16:36

answered Oct 19 '19 at 21:09

NVRM

