I found this on StackOverflow: Is there a way to programmatically determine if a font file has a specific Unicode Glyph?

However, I also need to check characters above U+FFFF, which .NET strings store as surrogate pairs. What I am trying to do is parse the Unihan data over at unicode.org and ignore all characters that are not supported by "Arial Unicode MS". I started work on the CheckIfCharInFont() method by changing the argument to a string (so it can hold a surrogate pair) and checking whether it is a surrogate pair. If it is, I get the Int32 code point with char.ConvertToUtf32(surrogate1, surrogate2). The problem is that the current CheckIfCharInFont() method only supports UInt16 values; the Debugger.Break() call in the code below marks the spot where this falls apart. Can anyone help me figure this out?

Thanks

public static bool CheckIfCharInFont(string character, Font font)
{
    // Decode the input to a code point; a string longer than one char
    // is assumed to hold a surrogate pair.
    int temp;
    if (character.Length > 1) //UTF-32
    {
        temp = char.ConvertToUtf32(character[0], character[1]);
    }
    else
    {
        temp = (int)character[0];
    }

    // The font ranges are 16-bit, so anything above U+FFFF cannot be checked here.
    if (temp > UInt16.MaxValue)
    {
        Debugger.Break();
        return false;
    }
    UInt16 value = (UInt16)temp;

    //UInt16 value = Convert.ToUInt16(character);
    List<FontRange> ranges = GetUnicodeRangesForFont(font);
    bool isCharacterPresent = false;
    foreach (FontRange range in ranges)
    {
        if (value >= range.Low && value <= range.High)
        {
            isCharacterPresent = true;
            break;
        }
    }
    return isCharacterPresent;
}
Matt
  • 1
    According to the docs (http://msdn.microsoft.com/en-us/library/dd144956%28v=vs.85%29.aspx), the GLYPHSET structure that's returned from GetFontUnicodeRanges contains 16-bit values (unless you ask for 8-bit values instead). Looks like this API simply isn't capable of telling you about Unicode code points above U+FFFF. I don't know whether there's another way or not. – Joe White Mar 11 '11 at 14:49
  • 1
    Well, that just blows. There must be some way of listing ALL the codepoints in a font file (or something similar), then I could run a match against what I have against what is in the font file and ignore any characters that are missing from the font. Guess I'll just have to keep looking around. No luck so far though... :-( – Matt Mar 12 '11 at 06:44

1 Answer

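You would have to parse the font file yourself, starting from the OpenType spec for the cmap table (http://www.microsoft.com/typography/otspec/cmap.htm). Code points above U+FFFF are mapped by the 32-bit cmap subtable formats, such as format 12.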

It is what this tool does: http://mihai-nita.net/2007/09/08/charmapex-some-kind-of-character-map/
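
As a rough illustration of what reading the cmap table by hand could look like, here is a minimal sketch. It assumes a single-font .ttf/.otf file (not a .ttc collection), does no bounds checking, and only looks for a format 12 subtable. The class and method names (CmapFormat12Reader, ReadRanges, Contains) are made up for this example, and the value tuples need C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library: lists the code point
// ranges found in a font's cmap format 12 subtable.
public static class CmapFormat12Reader
{
    // OpenType stores all multi-byte values big-endian.
    private static int ReadU16(byte[] d, long o) => (d[o] << 8) | d[o + 1];

    private static uint ReadU32(byte[] d, long o) =>
        ((uint)d[o] << 24) | ((uint)d[o + 1] << 16) | ((uint)d[o + 2] << 8) | d[o + 3];

    public static List<(uint Low, uint High)> ReadRanges(byte[] font)
    {
        var ranges = new List<(uint Low, uint High)>();

        // Table directory: numTables at offset 4, 16-byte records from offset 12.
        int numTables = ReadU16(font, 4);
        long cmap = -1;
        for (int i = 0; i < numTables; i++)
        {
            long rec = 12 + 16L * i;
            if (font[rec] == 'c' && font[rec + 1] == 'm' &&
                font[rec + 2] == 'a' && font[rec + 3] == 'p')
            {
                cmap = ReadU32(font, rec + 8); // table offset from start of file
                break;
            }
        }
        if (cmap < 0) return ranges;

        // cmap header: version, numTables, then 8-byte encoding records
        // (platformID, encodingID, 32-bit offset from the start of cmap).
        int numSubtables = ReadU16(font, cmap + 2);
        for (int i = 0; i < numSubtables; i++)
        {
            long sub = cmap + ReadU32(font, cmap + 4 + 8L * i + 4);
            if (ReadU16(font, sub) != 12) continue; // only format 12 is handled

            // Format 12: 16-byte header, then 12-byte groups of
            // (startCharCode, endCharCode, startGlyphID).
            uint numGroups = ReadU32(font, sub + 12);
            for (uint g = 0; g < numGroups; g++)
            {
                long grp = sub + 16 + 12L * g;
                ranges.Add((ReadU32(font, grp), ReadU32(font, grp + 4)));
            }
            break;
        }
        return ranges;
    }

    // Same range test as CheckIfCharInFont, but on a full 32-bit code point.
    public static bool Contains(List<(uint Low, uint High)> ranges, int codePoint)
    {
        foreach (var r in ranges)
            if (codePoint >= r.Low && codePoint <= r.High) return true;
        return false;
    }
}
```

You could feed it File.ReadAllBytes(...) with the path to the font file (wherever that lives on your machine) and test a code point obtained from char.ConvertToUtf32(text, index). If the font has no format 12 subtable, the list comes back empty; a fuller version would also need to handle format 4 and the other subtable formats described in the spec.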

Mihai Nita