On Android, I want to detect whether the font in use can display a certain character. As I understand it, this is not possible by conventional means, as indicated by the question "Check if custom font can display character".

To detect this, I render the character I want to check into one bitmap, render another character that I know is missing into a second bitmap, and compare the contents of the two bitmaps. If they are equal, the character is missing.

The question is: is there any Unicode character whose glyph is (more or less) guaranteed to be missing from the fonts typically used on Android phones?

The Unicode replacement character sounds promising when reading about it on Wikipedia:

It is used to indicate problems when a system is not able to render a stream of data to a correct symbol. It is most commonly seen when a font does not contain a character, but is also seen when the data is invalid and does not match any character

However, after a bit of testing I see that this character is not used to represent missing glyphs on either my Windows 7 computer or the Android phone I've tested with (Motorola Atrix).

Thomas Tempelmann
nibarius
  • Unicode contains a repertoire of more than 110,000 characters and has a limit of 1,114,112 code points. So in the unlikely case that a font has all glyphs (supports all writing systems, all languages), only 10% of the available code points are used. The rest is empty. What if you render a whitespace glyph? Do you know about the 'missing character glyph'? – allcaps Mar 18 '14 at 12:48
  • I wasn't aware of 'missing character glyph', some Googling suggests that U+0000 can/should be used for missing characters in the font. However in at least one font I've tested with U+0000 is rendered as whitespace while missing characters are rendered as squares (similar to U+25A1). I guess my best bet is to use some reserved/unassigned unicode character instead. – nibarius Mar 19 '14 at 08:05
  • U+0000 is usually used to mark the end of a string. You need .notdef, unicode value undefined: http://www.microsoft.com/typography/otspec/recom.htm Characters are assigned in blocks of the same kind. Most blocks have some unassigned points at the end to start the next block on a round number. These points allow Unicode Consortium to add new glyphs to a block. New glyphs don't come into existence often. See http://typophile.com/node/102205. Maybe you can ask your question in the Typophile forum. They can tell you more about how this exactly works and how to render .notdef – allcaps Mar 19 '14 at 11:55
  • Thanks for the "Recommendations for OpenType Fonts" link, that was useful for me. It seems like I confused the glyph id 0 with unicode code point U+0000. For what I'm trying to do using one of the reserved code points should be good enough (see my own answer). – nibarius Mar 25 '14 at 13:45
  • Yes, it's a duplicate. Slightly different reason for wanting to detect missing glyphs (automatic detection vs manual detection by the user), but the actual question is the same. – nibarius Nov 14 '15 at 22:20

1 Answer

There isn't any designated Unicode value for the glyph that is used to render characters missing from the font in use. In the font itself, glyph id 0 should always be the .notdef glyph, which is used for all characters that are missing a glyph. However, it is not possible to get this information from the fonts on Android, so the .notdef glyph cannot be used directly.

In Unicode there are many reserved/unassigned code points, and my limited testing indicates that these code points are rendered using the .notdef glyph. So by using U+0978, which is a reserved code point in the middle of the Devanagari block, I can detect whether some other valid, known character exists in the font I want to test.
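The comparison described above could be sketched roughly as follows. This is only an illustration in plain Java, not the code I actually used: `GlyphProbe` and `GlyphRenderer` are hypothetical names, and the renderer is assumed to draw a single code point with the font under test and return the raw pixel bytes (on Android, that would be the place to draw the text into a bitmap).

```java
import java.util.Arrays;

/** Sketch: decides whether a font lacks a glyph by comparing rendered pixels. */
final class GlyphProbe {

    /** Hypothetical abstraction: draws one code point and returns its raw pixels. */
    interface GlyphRenderer {
        byte[] render(int codePoint);
    }

    private final GlyphRenderer renderer;
    private final byte[] missingPixels;

    /**
     * @param renderer              draws with the font under test (assumed, not a real API)
     * @param knownMissingCodePoint a code point assumed to have no glyph, e.g. a reserved one
     */
    GlyphProbe(GlyphRenderer renderer, int knownMissingCodePoint) {
        this.renderer = renderer;
        // Render the sentinel once; this is what the font's .notdef glyph looks like.
        this.missingPixels = renderer.render(knownMissingCodePoint);
    }

    /** True if the code point renders pixel-for-pixel like the sentinel. */
    boolean isMissing(int codePoint) {
        return Arrays.equals(renderer.render(codePoint), missingPixels);
    }
}
```

Note that a character whose real glyph happens to look exactly like the font's .notdef glyph would also be reported as missing by this kind of comparison.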

This is not a future-proof solution, since the Unicode Consortium may assign new characters to reserved code points in the future. But for my needs it's good enough, since what I want to do is a temporary thing that will no longer be relevant in the near future.

Update:

The solution of looking at U+0978 did not work for long. That character was added in the Unicode 7.0 release in June 2014. Another option is to use a character that exists in Unicode but is very unlikely to have a glyph in a normal font.

U+124AB in the Early Dynastic Cuneiform block is probably something that doesn't exist in many fonts at all.
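Since a reserved code point can become assigned later, a candidate sentinel can at least be checked against the Unicode tables of the Java runtime in use. The sketch below uses the standard `Character.getType` method; the class and method names are made up for illustration. Which code points count as unassigned depends on the Unicode version the runtime supports, so the result for something like U+0978 can differ between runtimes.

```java
/** Sketch: classifies candidate sentinel code points using the runtime's Unicode data. */
final class SentinelCheck {

    /** True if the runtime's Unicode tables have no character at this code point. */
    static boolean isUnassigned(int codePoint) {
        return Character.isValidCodePoint(codePoint)
                && Character.getType(codePoint) == Character.UNASSIGNED;
    }

    /** True for private-use code points, which Unicode never assigns to real characters. */
    static boolean isPrivateUse(int codePoint) {
        return Character.isValidCodePoint(codePoint)
                && Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    /** A candidate is plausible if no standard character lives at that code point. */
    static boolean isPlausibleSentinel(int codePoint) {
        return isUnassigned(codePoint) || isPrivateUse(codePoint);
    }
}
```

This only says what Unicode defines, not what a given font contains: a font is free to put its own glyphs on private-use code points, so the choice still has to be tested against the fonts in question.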

nibarius
  • Technically, U+25A1 is a popular choice since it has the comment "may be used to represent a missing ideograph" -- Alternatively U+20DE is also used. This comes directly from the current Unicode specifications: http://www.unicode.org/charts/PDF/U25A0.pdf – Michaelangel007 Sep 30 '15 at 15:26
  • That's useful to know. However, it couldn't be used in my particular case, since they only "may" be used for that purpose. In the Roboto font, for example, both U+25A1 and U+20DE are missing a glyph. – nibarius Oct 02 '15 at 06:01
  • "Another option is to use a glyph that exists in unicode but that is very unlikely to be used in a normal font." - May not work with Noto (which tries to provide glyphs for every character in Unicode). However, you could probably use a private use area code point. There are a ton of those and Unicode will never encode "real" characters there. – Kevin Dec 11 '18 at 18:26