Print each column of a list of lists (containing Unicode strings) at a fixed width in Python

Question

I'd like to print a list of lists so that each string is padded to the maximum width of its column. Here is what I have tried:

lists = [['abcde',      u'一二三四五'], 
         [u'六七八九十零',    'fghij']]

# calculate the maximum length of each column
column_max_len = [max(len(item) for item in t) for t in zip(*lists)] # [6, 5]

# format print
for row in lists:
    s = '\t'.join(['{value:<{width}}'.format(value=row[idx].encode('utf-8'), width=column_max_len[idx]) 
                    for idx, item in enumerate(row)])
    print(s)

The output is:

abcde   一二三四五
六七八九十零  fghij

My expected result is:

abcde       一二三四五
六七八九十零  fghij

Don't encode to UTF-8 before formatting, for starters. You have double-width Unicode codepoints, so aligning those is going to be.. tricky. See the duplicate. — Martijn Pieters, Sep 25 '16 at 15:12
Other posts on the subject: [Alignment of Wide east asian characters with format function](http://stackoverflow.com/q/21921869), [How to control padding of Unicode string containing east Asia characters](http://stackoverflow.com/q/4622357), [Programmatically tell if a Unicode character takes up more than one character space in a terminal](http://stackoverflow.com/q/7086856), etc. — Martijn Pieters, Sep 25 '16 at 15:14
@MartijnPieters, if I remove `.encode('utf-8')`, an error is raised: `UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)`. — SparkAndShine, Sep 25 '16 at 15:14
You'll need to use *Unicode* string literals for the format string and the `\t` joiner: `u'\t'.join([u'{value:<{width}}'.format(...) ...])` — Martijn Pieters, Sep 25 '16 at 15:15
@MartijnPieters, thx a lot. The output is still the same, not pretty. — SparkAndShine, Sep 25 '16 at 15:20
@MartijnPieters, replacing `len(item)` with `sum([1 if ord(c)<128 else 2 for c in item])` works for me. — SparkAndShine, Sep 25 '16 at 15:55
That's... a rather naive way of making it work. The majority of Unicode codepoints are *not* wide characters. I linked you to a post that lets you detect if a codepoint is wide, at the very least use that. — Martijn Pieters, Sep 25 '16 at 17:47
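
For reference, here is a minimal sketch of the approach suggested in the comments: measure each string by its display width rather than `len()`, using the standard library's `unicodedata.east_asian_width` to detect wide code points. It is written for Python 3, where every `str` is Unicode; under Python 2 the literals and format strings would need the `u` prefix, as noted above. The helper names `display_width` and `pad` are made up for illustration. It assumes a monospaced terminal that draws Wide (`'W'`) and Fullwidth (`'F'`) characters in exactly two columns, and it does not handle combining marks or Ambiguous-width characters.

```python
import unicodedata


def display_width(text):
    """Approximate terminal columns: Wide/Fullwidth code points count as 2.

    Hypothetical helper; ignores combining marks and Ambiguous-width characters.
    """
    return sum(2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
               for ch in text)


def pad(text, width):
    """Left-align text by appending spaces until it fills `width` columns."""
    return text + ' ' * (width - display_width(text))


lists = [['abcde', '一二三四五'],
         ['六七八九十零', 'fghij']]

# Maximum display width of each column (not the number of code points).
column_widths = [max(display_width(item) for item in column)
                 for column in zip(*lists)]

# Pad every cell to its column's width and join the cells with two spaces.
lines = ['  '.join(pad(item, width) for item, width in zip(row, column_widths))
         for row in lists]

for line in lines:
    print(line)
```

Padding is done by hand because the `<` alignment in `str.format` counts code points, not terminal columns, so it under-pads cells that contain wide characters. Tabs are avoided for the same reason: where a tab stop lands depends on the text before it.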

