
I have a column containing both English and Japanese strings, and I need to print it right-justified.

This is the column I need to print:

column = ["通常残業時間", "bbbbbbbbb", "tttt"]

The usual approach would be to find the maximum string length and pad accordingly, but some of the text is Japanese, and a Japanese character is displayed wider than an English one. How should I compare the string widths in this case and print accordingly?

This is the required output:

通常残業時間
 bbbbbbbbb
      tttt

I am working in Python 3.

Jarvis

3 Answers


The problem is that a Japanese character is a little wider than an English character, and also wider than the space character ' '.

There are solutions for this situation. You just need to work out the relative widths of the two kinds of character.

column = ["通常残業時間", "bbbbbbbbb", "tttt"]
for i in column:
    # Put a | between characters so the widths are easy to compare by eye
    print('|'.join(i))

You should get something like this:

通|常|残|業|時|間
b|b|b|b|b|b|b|b|b
t|t|t|t

You can use the | characters to work out the width relationship. Here it looks to me as if roughly 5 Japanese characters equal 9 English characters (don't forget to subtract the | characters).

Once you have the width relationship, you can calculate how much padding each string needs.


Sorry for the wrong or misleading advice above. I realized that you can't make the text align unless you find a space of a different width to match each language's characters.

But I think I have found a related question and a useful package:

Display width of unicode strings in Python [duplicate]

kitchen.text.display.textual_width (sadly, it is only for Python 2.7...)
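For Python 3, one possible way to estimate display width without a third-party package is the standard library's unicodedata.east_asian_width. The sketch below is only an illustration: the helper names display_width and right_justify are made up for this example, and it assumes a monospace terminal that draws wide ('W') and fullwidth ('F') characters in two columns and everything else in one. Combining marks and ambiguous-width characters are not handled.

```python
import unicodedata

def display_width(text):
    """Estimate how many terminal columns `text` occupies (hypothetical helper)."""
    width = 0
    for ch in text:
        # Assumption: 'W' (wide) and 'F' (fullwidth) take two columns,
        # every other character is counted as one column.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

def right_justify(items):
    """Left-pad each item with spaces so the right edges line up."""
    target = max(display_width(item) for item in items)
    return [" " * (target - display_width(item)) + item for item in items]

column = ["通常残業時間", "bbbbbbbbb", "tttt"]
for line in right_justify(column):
    print(line)
```

The padding is computed from the estimated column count rather than from len(), so a six-kanji string counts as 12 columns and the nine-letter string receives three leading spaces.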

tianhua liao
  • I am afraid this answer doesn't help much. Can you clarify a bit more on the last part of your answer? – Jarvis Oct 05 '18 at 06:07

It seems that kanji (and Chinese) characters are displayed twice as wide as ASCII characters.

So I will use .encode('ascii') and catch UnicodeEncodeError to check whether a line is ASCII or not. (Based on the answer here: https://stackoverflow.com/a/196391/837627)

If it is ASCII, we will need more spaces in front of the line.

Here is a solution:

words = ["hhhh", "你你你你你你"]
max_length = 0

# Find the widest string in the list
# For kanji strings, the length is doubled
for line in words:
    line_length = 0
    try:
        line.encode('ascii')
    except UnicodeEncodeError:
        line_length = 2 * len(line)
    else:
        line_length = len(line)
    if max_length < line_length:
        max_length = line_length

# Find the number of spaces to add by subtracting the width of the current line from the max width
# If the current line is kanji, its width is twice its character count
for line in words:
    space = 0
    try:
        line.encode('ascii')
    except UnicodeEncodeError:
        space = max_length - (len(line)*2)
    else:
        space = max_length - len(line)
    print((' ')*space + line)

Output:

        hhhh
你你你你你你

The first line is 4 ASCII characters long. The second line is 6 Chinese characters long, which is as wide as 12 ASCII characters. So 12 - 4 = 8 spaces are needed in front of the first line (assuming a monospace font!). It does not look correct on Stack Overflow, but in a terminal it will be aligned because of the monospace font.

By the way, I used Python 3 to write this solution.
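The solution above classifies a whole string as either ASCII or not, so a string that mixes both kinds of character would be doubled in full. A minimal per-character variant of the same idea might look like the sketch below. The name approx_width and the mixed sample string are made up for illustration, and the rule "non-ASCII means two columns" is an assumption that overcounts narrow non-ASCII characters such as accented Latin letters.

```python
def approx_width(text):
    """Count each ASCII character as 1 column and any other character as 2."""
    # Assumption: every non-ASCII character is double-width.
    return sum(1 if ord(ch) < 128 else 2 for ch in text)

# The third entry mixes kanji and ASCII (illustrative sample, not from the question)
words = ["hhhh", "你你你你你你", "残業time"]
max_width = max(approx_width(word) for word in words)

for word in words:
    # Pad on the left by the difference in estimated columns
    print(" " * (max_width - approx_width(word)) + word)
```

Because the width is summed character by character, the mixed string is counted as 2 + 2 + 4 = 8 columns rather than 12.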

bunbun

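You can use str.rjust on the last two items of column. The width of 12 is hard-coded here: it is the six kanji of the first item counted as two columns each.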

column = ["通常残業時間", "bbbbbbbbb", "tttt"]

for idx, item in enumerate(column):
    if not idx:
        print(item)
    else:
        print(item.rjust(12))

Output:

通常残業時間
   bbbbbbbbb
        tttt
vash_the_stampede