17

When it comes to Chinese characters, I am unable to get the Front End of Mathematica to use the fonts of my choice. How can I get it to use the fonts I need?

Here are two screenshots showing the problem, one from Word (top) and one from Mathematica on WinXP, both displaying the same string. Note that Mathematica uses several different fonts. I guess it falls back to font substitution when the font it tries first lacks a glyph, but the font I specified contains every glyph I need! Here I use the font Microsoft YaHei, which comes with Win7 but can be downloaded for XP too.

EDIT: Here's some test code:

str = "肖诮陗俏削帩消峭捎绡莦弰悄焇琑逍㲖㲵䏴哨娋宵屑綃梢痟睄筲艄萷销䇌䘯趙揱旓硝稍踃輎矟䌃箾蛸誚榍蕱銷鞘潲碿糏霄䴛韒髾鮹鞩魈颵"

Style[str, Large, FontFamily -> "SimSun"]

(SimSun comes with XP and should contain all these characters too, although I am not sure this holds for every version of the font.)

EDIT 2: I am on Windows XP (with East Asian language support enabled). I wonder whether the results differ on other OSs.

[Screenshots: fonts in Word (top); fonts in Mathematica (bottom)]


Summary: It appears that the behaviour depends on the particular OS and the fonts installed, and unfortunately there seems to be no way to make the fonts uniform (even if there exists a single font containing all the glyphs).

Szabolcs
  • Can you post the Unicode text for that string? – Mr.Wizard May 02 '11 at 13:02
  • @Mr.Wizard, yes, should've done that. – Szabolcs May 02 '11 at 13:07
  • Correction, I see this: http://i.imgur.com/zvA8o.gif – Mr.Wizard May 02 '11 at 14:33
  • @Mr.Wizard, what OS are you using? The font looks weirdly unhinted (OS X?), but it's correct. I'm using Windows XP. – Szabolcs May 02 '11 at 15:21
  • @Mr.Wizard, I also wonder, are you able to change the font style (apart from getting a consistent font throughout the string)? Can you please try SimSun and SimHei on Windows? These are available by default and are sufficiently different that you'll notice if font changing works. – Szabolcs May 02 '11 at 15:29
  • Under Win7-64 I get the same results as Szabolcs. – Sjoerd C. de Vries May 02 '11 at 19:25
  • I am using mma7 on Windows XP. I don't think I have SimSun or SimHei installed. It occurs to me that I used `Rasterize` rather than taking a screen capture. I'll try it with a screen cap. – Mr.Wizard May 03 '11 at 00:24
  • Here is a cropped screencap after installing SimSun (and Microsoft YaHei from before): http://i.imgur.com/MBSNE.png – Mr.Wizard May 03 '11 at 00:38
  • @Mr.Wizard, while the font style in your screenshot is uniform, that font is not SimSun (SimSun has variable-width strokes). So something goes wrong on your machine as well. I guess there's nothing I can do about this problem, though it is strange that it only appears when some "more unusual" glyphs are used, and not with Roman letters. – Szabolcs May 05 '11 at 10:30
  • I agree it is not right. I wanted to show you what I saw, for reference. Sorry if I gave the impression that I thought it worked correctly on my installation. Having almost no experience with non-English alphabets, I will not even guess as to the cause of the problem(s). – Mr.Wizard May 05 '11 at 10:35
  • @Mr.Wizard, you didn't give the wrong impression, your comment was very useful. – Szabolcs May 05 '11 at 10:48
  • @Mr.Wizard, why did you use `Rasterize` rather than doing a screen capture? Do you have a convenient way to upload images to imgur directly from Mathematica? – Szabolcs May 05 '11 at 16:47
  • Szabolcs, I have not set up such a thing, but it should be possible. By habit I use Rasterize, right-click, Save Graphic As... and then upload that file. For me this is faster than "Print Screen", open graphics application, paste, select, crop, Save As..., and finally upload. If I devise a way to upload to imgur.com I will still have to create an image tag and paste the URL here, so I don't think it will save much time. – Mr.Wizard May 06 '11 at 04:34
  • Have you directly contacted anyone at WRI? I am quite curious to know the reply. – Mr.Wizard May 10 '11 at 12:45

3 Answers

4

Font family names are not the same as the system's font file names. You can find the name Mathematica expects as follows.

An easy way to get the right font family name:

  • First, type some text, e.g. "我们", and set it to your target font using the Mathematica menu.
  • Second, append //InputForm and evaluate the cell. The output shows the right font name. On my computer, the font family name for "楷体" is "¿\[Not]Ìå_GB2312". Awesome.

Verbeia
nanohc
4

Mathematica may be replacing your FontFamily setting with a substitute font. Running

Options[$FrontEnd, FontSubstitutions]

will give you the replacement list Mathematica uses.
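
As a rough sketch, the list can be narrowed down to the CJK-related entries. This assumes the option value is a flat list of rules whose left-hand sides are placeholder names such as "DefaultKanjiFont" or "DefaultChineseSimplifiedFont" (the names mentioned in the comments here); the actual structure may differ between versions and platforms, and the variable names are only illustrative.

```mathematica
(* Extract the substitution list from the front end options. *)
subs = FontSubstitutions /. Options[$FrontEnd, FontSubstitutions];

(* Keep only the rules whose left-hand side mentions Kanji or Chinese.
   Assumption: entries look like "DefaultKanjiFont" -> "some font name". *)
cjkSubs = Select[subs,
  MatchQ[#, _Rule | _RuleDelayed] &&
    ! StringFreeQ[ToString[First[#]], "Kanji" | "Chinese"] &]
```

This only inspects the settings and does not change them. It needs a running front end, so it will not work from a standalone kernel.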

Phil
  • This could very well be the solution to the problem. I need more time to figure out how to change it though. – Szabolcs Jun 09 '11 at 13:09
  • Clearing this option does change the fonts used, but there are still several different fonts for different characters. – Szabolcs Jun 09 '11 at 13:13
  • The other reason could be that Mathematica is displaying the characters with a fixed width. That can be altered with the `FontTracking` option. – Phil Jun 09 '11 at 13:33
  • Look at the substitutions for DefaultKanjiFont, DefaultMonoKanjiFont, DefaultChineseSimplifiedFont, etc. Mathematica's Unicode font support was implemented in a day and age when barely anybody else (including most OS's) did Unicode in any useful way, and so any font in the appropriate Far Eastern encoding ranges was implemented as a placeholder font name which was substituted via `FontSubstitutions` (which does, indeed, have platform-dependent settings). Archaic now, and should be fixed up at some point (this isn't the first time I've heard this complaint). – John Fultz Dec 02 '11 at 07:56
  • @JohnFultz Thank you for the suggestions. Even when setting all those options to the same font, different characters are shown in different fonts. But this isn't very limiting for me anymore. A note about Simplified/Traditional/Kanji: generally it's not possible to tell if a character is Chinese or Japanese based on its Unicode code point because it is likely to be used in both languages. The reason there exist different fonts for these is that some characters tend to be written in one or another style in the different languages, even though they share a code point. Try e.g. 直 ... – Szabolcs Dec 02 '11 at 10:47
  • ... or 令 in a Japanese (Meiryo) or Simplified Chinese (Microsoft YaHei) font in a word processor. They will look quite different. Since it's usually not possible to tell which language's font should be used for a given character, the decision is usually made based on system settings, which appears to be Japanese for Western-language systems (at least on Windows). See also http://en.wikipedia.org/wiki/Unihan#Examples_of_language_dependent_characters (works in Firefox on Windows, but not e.g. in anything Webkit based or on Linux). – Szabolcs Dec 02 '11 at 10:51
1

For your first edit, some versions of the SimSun font may not cover enough of the CJK range (some of the characters in str belong to CJK Unified Ideographs Extension A). There is a wonderful site that summarizes the coverage of many East Asian Unicode fonts.
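
To see which characters fall into Extension A, one can filter by code point. The following is a minimal sketch: the block range U+3400 to U+4DBF is the standard Unicode allocation for CJK Unified Ideographs Extension A, `extAQ` is just an illustrative helper name, and `sample` is a short excerpt of the test string rather than the whole thing.

```mathematica
(* A short excerpt of the test string from the question. *)
sample = "肖诮陗俏削㲖㲵䏴哨娋";

(* True if a single-character string lies in
   CJK Unified Ideographs Extension A (U+3400 to U+4DBF). *)
extAQ[ch_String] := 16^^3400 <= First[ToCharacterCode[ch]] <= 16^^4DBF

(* The characters of the sample that belong to Extension A. *)
Select[Characters[sample], extAQ]
```

Replacing `sample` with the full `str` lists every character that a font without Extension A coverage would have to borrow from another font.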

For your second edit, I think maybe you would like to use

"\[CapitalIHat]\[Cent]\[CapitalEGrave]\[IAcute]\[CapitalNTilde]\[CapitalARing]\.ba\[CapitalUAcute]"

as the FontFamily. This is the font's Simplified Chinese name (微软雅黑) encoded in CP936, with each byte then read as a separate character:

FromCharacterCode[ToCharacterCode["微软雅黑", "CP936"]]

It works fine on my English version of Windows 7.
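
A minimal sketch of the same conversion wrapped in a helper, together with the reverse direction as a sanity check. `cp936Name` is a hypothetical name introduced here for illustration, and whether the front end accepts the resulting string as a FontFamily presumably depends on the system and its installed fonts.

```mathematica
(* Encode a font name in CP936, then read each byte as one character. *)
cp936Name[name_String] :=
  FromCharacterCode[ToCharacterCode[name, "CP936"]]

mangled = cp936Name["微软雅黑"];

(* Reverse direction: treat the character codes as CP936 bytes again. *)
restored = FromCharacterCode[ToCharacterCode[mangled], "CP936"];

(* Should compare equal if the round trip is lossless. *)
restored === "微软雅黑"

(* Try the converted name as a font family. *)
Style["直令", Large, FontFamily -> mangled]
```

The family name quoted in the other answer for "楷体" looks like the result of the same conversion applied to "楷体_GB2312", which suggests the two answers describe the same mechanism.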

And I think that, at least for the English version of Mathematica, the FrontEnd always tries to interpret a CJK character as a Japanese character first. If that fails (meaning the character does not appear in the # Japanese section of UnicodeLanguageFontMapping.tr), it looks the character up in the # Chinese simplified section. The default fonts for the different languages are then defined in UnicodeFontMapping.tr (there is even a Klingon entry LOL), which ties in with @JohnFultz's suggestion.
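
To find these two files, one option is to search the installation tree. This sketch assumes they live somewhere under `$InstallationDirectory`, which may not be true for every version or platform.

```mathematica
(* Search the whole installation tree for the two mapping files.
   Assumption: they are stored under $InstallationDirectory. *)
FileNames[
 {"UnicodeLanguageFontMapping.tr", "UnicodeFontMapping.tr"},
 {$InstallationDirectory}, Infinity]
```

Both are plain-text resource files, so they can be inspected in any text editor. It seems prudent to keep a backup before changing anything in them.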

Silvia
  • I have the Win7 version of SimSun installed here, and I am sure it contains all these characters, as MS Word displays them in a uniform style when using this font. The rest of your suggestions I'll need to play with. Yes, it seems to be true for most software (including Windows itself) that Western language versions always try a Japanese font first. I have modified my WinXP to look for Simplified Chinese first, so I'd get the version of e.g. 直 that I like. – Szabolcs Jan 08 '12 at 09:38
  • @Szabolcs In that case maybe the "Font name"/"Font family" of the SimSun font is not "SimSun". I have three simsun files, whose "Font name"/"Font family" values are respectively "宋体", "新宋体" (both of which cover the *CJK Unified Ideographs* and *CJK Unified Ideographs Extension A*) and "SimSun-ExtB" (which covers the *CJK Unified Ideographs Extension B* only). Maybe use software like Babel Map ( http://www.babelstone.co.uk/software/babelmap.html ) to check the exact range of the font. – Silvia Jan 08 '12 at 10:36
  • I also found that in the **UnicodeLanguageFontMapping.tr** file, many characters common to both Japanese and Simplified Chinese are placed only in the **# Japanese** section. I tried moving the ones belonging to *CJK Unified Ideographs* and *CJK Unified Ideographs Extension A* to the **# Chinese simplified** section, changing the encoding from ShiftJIS to CP936, and found that those characters are now automatically recognized as Chinese characters and assigned DefaultChineseSimplifiedFont/DefaultMonoChineseSimplifiedFont. – Silvia Jan 08 '12 at 11:41