from prettytable import PrettyTable

header="乘客姓名,性别,出生日期".split(",")
x = PrettyTable(header)
x.align["乘客姓名"]="l"
table='''HuangTianhui,男,1948/05/28
姜翠云,女,1952/03/27
李红晶,女,1994/12/09
LuiChing,女,1969/08/02
宋飞飞,男,1982/03/01
唐旭东,男,1983/08/03
YangJiabao,女,1988/08/25
买买提江·阿布拉,男,1979/07/10
安文兰,女,1949/10/20
胡偲婠(婴儿),女,2011/02/25
(有待确定姓名),男,1985/07/20
'''
data=[row for row in table.split("\n") if row]
for row in data:
    x.add_row(row.strip().split(","))

print(x)

[image: output of the script above]

The output format I want is the following.

[image: desired output]

In this example, prettytable.py does not pad the row containing 买买提江·阿布拉 correctly, because the character · has ambiguous width. How can I fix this bug in prettytable.py?

I have added two lines to def _char_block_width(char) in prettytable.py, but the problem remains.

if char == 0xb7:
    return 2 

I have solved it: on my computer the file prettytable.py has to be installed directly in d:\python33\Lib\site-packages, not as d:\python33\Lib\site-packages\prettytable\prettytable.py

There are many characters with ambiguous width, and adding two lines like the following for each of them is not practical: with 50 ambiguous characters, 100 lines would have to be added to prettytable.py. Is there a simpler way, changing only a few lines so that all ambiguous-width characters are handled?

if char == 0xb7:
    return 2 
mkj
showkey
  • This may help: http://stackoverflow.com/questions/4622357/how-to-control-padding-of-unicode-string-containing-east-asia-characters – icedtrees Apr 12 '14 at 07:37
  • I have read the post. Using a set of full-width versions of the printable ASCII characters is not a good idea. I found that R has no such problem displaying all kinds of characters; Python could learn from R here. Now I want to know how R does it. – showkey Apr 12 '14 at 08:01

1 Answer


The issue you're running into has to do with the dot character in the incorrectly padded line of your Python output. The dot is Unicode code point U+00B7 · middle dot. This character is considered to have an "ambiguous" width, as it is a narrow character in most non-East-Asian fonts, but is rendered as full-width in most Asian ones. Without context, a program can't tell how wide it will appear on the screen. Unfortunately, Python's Unicode system doesn't appear to have any way to provide that context.
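
To see how Python classifies these characters, the standard `unicodedata` module can be queried. This is only a small sketch: the helper name `width_class` is made up for illustration, and the result tells you that a character is ambiguous, not how your console will actually draw it.

```python
import unicodedata

def width_class(ch):
    """Return the East Asian Width class of a single character.

    Classes include 'Na' (narrow), 'W' (wide), 'F' (fullwidth)
    and 'A' (ambiguous). Hypothetical helper, not part of prettytable.
    """
    return unicodedata.east_asian_width(ch)

# Latin letter, CJK ideograph, middle dot, katakana middle dot
for ch in ("a", "\u4e70", "\u00b7", "\u30fb"):
    print("U+%04X" % ord(ch), width_class(ch))
```

The class reported for U+00B7 is `A`, which is exactly the ambiguity described above: the database records that the width depends on context, but does not supply the context.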

One fix might be to replace the offending dot with one that has an unambiguous width, such as U+30FB katakana middle dot (which is always full width). This way the padding logic will be able to recognize that extra space is needed for that line.
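
A minimal sketch of that substitution, applied to each field before it is handed to the table. The function name `normalize_dot` is an assumption for illustration; only `str.replace` is doing the work.

```python
MIDDLE_DOT = "\u00b7"           # ambiguous width
KATAKANA_MIDDLE_DOT = "\u30fb"  # always full width

def normalize_dot(text):
    """Swap the ambiguous middle dot for the katakana middle dot.

    Hypothetical helper; it changes the stored text, so only use it
    where the exact code point does not matter downstream.
    """
    return text.replace(MIDDLE_DOT, KATAKANA_MIDDLE_DOT)

row = "买买提江\u00b7阿布拉,男,1979/07/10"
fields = [normalize_dot(field) for field in row.split(",")]
print(fields)
```

In the question's loop this would mean calling `x.add_row(...)` with the normalized fields instead of the raw ones.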

Another solution could be to set your console to use a font with a more Western treatment of the middle dot character, rather than the current one that follows the East-Asian style of rendering it as full-width. This will mean that the existing padding is correct. Your output from R clearly uses a different font than the Python output does, and its font renders the dot as half-width.

Blckknght
  • Who can help me fix the bug in the prettytable source code? I have tried to understand the source code, but it is difficult for me to follow. – showkey Apr 12 '14 at 11:34
  • The relevant part of the code is [here](https://code.google.com/p/prettytable/source/browse/trunk/prettytable.py#1473). The question is how to handle the ambiguity. If you just need a crude fix for your own use, you could simply add in `0xB7` as an extra case that gets treated as width 2, but that probably won't be something the upstream folks will care for. A better solution would be to pass some extra parameter to the width function telling it if you're in an East-Asian context or not, but that would require a bit more work to get set up. – Blckknght Apr 12 '14 at 17:27
  • I have added two lines to `def _char_block_width(char):` but the problem remains: `if char == 0xb7: return 2` – showkey Apr 28 '14 at 13:19
  • @it_is_a_literature: When I add the two lines you refer to into my copy of the `prettytable` module, I get correctly spaced output. As long as you put the new lines above the `# Take a guess` comment and the following `return` statement, it should work. – Blckknght Apr 29 '14 at 02:42
  • I have downloaded the file and added the two lines to `_char_block_width`. How can I install it now? I created a new directory `D:\python33\Lib\site-packages\prettytable` and saved the file there as prettytable.py. When I run `>>> from prettytable import PrettyTable`, the error is: Traceback (most recent call last): File "", line 1, in ImportError: cannot import name PrettyTable – showkey Apr 29 '14 at 03:12
  • @it_is_a_literature: Um, you shouldn't need to download it anew, you clearly already have a copy on your system, since you were able to do `from prettytable import PrettyTable`. Just track down your existing copy and edit it. You can find its location by doing `import prettytable; print(prettytable.__file__)`. – Blckknght Apr 29 '14 at 03:15
  • `>>> print(prettytable.__file__)` gives `D:\python33\lib\site-packages\prettytable-0.7.2-py3.3.egg\prettytable.py`. Why can't I find the file prettytable.py in prettytable-0.7.2-py3.3.egg? How can I change it? – showkey Apr 29 '14 at 05:29
  • After reinstalling it, `import prettytable` works, but I get an error from `>>> from prettytable import PrettyTable Traceback (most recent call last): File "", line 1, in ImportError: cannot import name PrettyTable` – showkey Apr 29 '14 at 06:02
  • I have solved it: the file prettytable.py has to be installed directly in `d:\python33\Lib\site-packages\` on my computer, not as `d:\python33\Lib\site-packages\prettytable\prettytable.py` – showkey Apr 29 '14 at 06:15
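
The "extra parameter" idea from the comments, which would also cover every ambiguous-width character at once instead of one `if` per code point, might look roughly like the sketch below. It is not prettytable's code: `char_width`, `str_width`, `pad` and the `east_asian` flag are hypothetical names, and the sketch assumes that in an East-Asian console every character of class `A` is drawn two cells wide, which depends on the font and terminal.

```python
import unicodedata

def char_width(ch, east_asian=True):
    """Guess how many terminal cells one character occupies.

    Hypothetical helper: ambiguous ('A') characters count as 2 only
    when east_asian is True. Control characters are not special-cased.
    """
    if unicodedata.combining(ch):
        return 0  # combining marks sit on the previous character
    cls = unicodedata.east_asian_width(ch)
    if cls in ("W", "F"):
        return 2  # wide and fullwidth are always two cells
    if cls == "A" and east_asian:
        return 2  # ambiguous: resolved by the caller's context
    return 1

def str_width(text, east_asian=True):
    """Sum the guessed widths of all characters in text."""
    return sum(char_width(ch, east_asian) for ch in text)

def pad(text, width, east_asian=True):
    """Left-align text in a field of the given display width."""
    return text + " " * max(0, width - str_width(text, east_asian))

name = "买买提江\u00b7阿布拉"
print(str_width(name, east_asian=True), str_width(name, east_asian=False))
```

With the flag on, the name from the question is measured one cell wider than with it off, which is the one cell of padding that goes wrong in the table.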