My question is similar to this one, but goes a step further.

In my Win32 program I have a menu button whose text contains Unicode characters above the BMP, such as U+1F5A4 (UTF-16 surrogate pair 0xD83D 0xDDA4).
In Windows 10 the system font Segoe UI doesn't have this glyph: it is automagically replaced with a glyph from the font Segoe UI Symbol and displayed correctly in the button, thanks to a process called font linking (or font fallback, it's still not clear to me which).
But in Windows 7 font linking leads to a font that doesn't have this glyph either, and the surrogate pair appears as two empty boxes ▯▯. The same happens in Windows XP with the Tahoma font.

I want to avoid these replacement boxes by parsing the text before or after assigning it to the button, and replacing the missing glyph with some common ASCII character.
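
The parsing part could look roughly like the sketch below, in portable C. The names `glyph_check_fn`, `replace_missing` and `bmp_only` are hypothetical and made up for illustration; the glyph test is deliberately left as a callback, because how to implement it reliably is exactly the open question. The sketch only walks the UTF-16 string, treats a surrogate pair as one character, and writes an ASCII substitute when the callback reports the glyph as missing.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical callback: returns nonzero if the code point can be displayed. */
typedef int (*glyph_check_fn)( uint32_t codepoint );

/* Placeholder check for illustration only: pretends that every character
   above the BMP is missing. A real check would have to ask the system. */
int bmp_only( uint32_t codepoint ) { return codepoint < 0x10000u; }

/* Copies 'src' into 'dst' (capacity 'dstLen' units, including the null),
   replacing every code point rejected by 'hasGlyph' with 'subst'.
   Returns the number of units written, excluding the terminating null. */
size_t replace_missing( const wchar_t *src, wchar_t *dst, size_t dstLen,
                        glyph_check_fn hasGlyph, wchar_t subst ) {
    size_t out = 0;
    if( dstLen == 0 ) return 0;
    while( *src && out + 1 < dstLen ) {
        uint32_t cp = (uint32_t)src[0];
        size_t units = 1;
        /* A high surrogate followed by a low surrogate encodes one code point */
        if( cp >= 0xD800 && cp <= 0xDBFF && src[1] >= 0xDC00 && src[1] <= 0xDFFF ) {
            cp = 0x10000u + ( ( cp - 0xD800u ) << 10 ) + ( (uint32_t)src[1] - 0xDC00u );
            units = 2;
        }
        if( !hasGlyph( cp ) ) {
            dst[out++] = subst;                  /* one ASCII unit replaces the whole character */
        } else {
            if( out + units >= dstLen ) break;   /* no room for the character plus the null */
            for( size_t i = 0; i < units; i++ ) dst[out++] = src[i];
        }
        src += units;
    }
    dst[out] = 0;
    return out;
}
```

With the placeholder check, a string made of the pair 0xD83D 0xDDA4 followed by "AB" comes out as "?AB" when `subst` is `'?'`.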

I tried GetGlyphOutline, ScriptGetCMap, GetFontUnicodeRanges and GetGlyphIndices, but they don't support surrogate pairs.
I also tried GetCharacterPlacement and Uniscribe's ScriptItemize+ScriptShape, which do support surrogate pairs, but all these functions search only the base font of the HDC (Segoe UI); they don't look in any fallback font (Segoe UI Symbol), which is the one that provides the glyph.

I also looked at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink, but I really don't think that is where the system takes the linked fonts from.

The question is: how can I know if the system font-linking produces the correct glyph or tofu boxes instead?


Edit

I found some kind of solution by copying from this code and adding the final GetCharacterPlacement call.

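#include <windows.h>
#include <usp10.h>
#include <stdio.h>
#include <wchar.h>

// Callback defined below; declared here so checkGlyphExist can pass it to EnumEnhMetaFile
int CALLBACK metaFileEnumProc( HDC hdc, HANDLETABLE *table, const ENHMETARECORD *record,
                            int tableEntries, LPARAM logFont );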

wchar_t *checkGlyphExist( HWND hwnd, wchar_t *sUnicode, wchar_t *sLimited ) {

    // Create metafile
    HDC hdc = GetDC( hwnd );
    HDC metaFileDC = CreateEnhMetaFile( hdc, NULL, NULL, NULL );

    // Select menu font
    NONCLIENTMETRICSW ncm;
    ncm.cbSize = sizeof(ncm);
    SystemParametersInfoW( SPI_GETNONCLIENTMETRICS, ncm.cbSize, &ncm, 0 );
    HFONT hFont = CreateFontIndirectW( &(ncm.lfMenuFont) );
    SelectObject( metaFileDC, hFont );
    wprintf( L"%s\n", ncm.lfMenuFont.lfFaceName );  // 'Segoe UI' in Win 10 and 7 (ok)
                                                    // 'Tahoma' in Win XP (ok)

    // Use the meta file to intercept the fallback font chosen by Uniscribe
    SCRIPT_STRING_ANALYSIS ssa;
    ScriptStringAnalyse( metaFileDC, sUnicode, wcslen(sUnicode), 0, -1,
                      SSA_METAFILE | SSA_FALLBACK | SSA_GLYPHS | SSA_LINK,  
                      0, NULL, NULL, NULL, NULL, NULL, &ssa );
    ScriptStringFree( &ssa );
    HENHMETAFILE metaFile = CloseEnhMetaFile(metaFileDC);
    LOGFONTW logFont = {0};
    EnumEnhMetaFile( 0, metaFile, metaFileEnumProc, &logFont, NULL );
    DeleteEnhMetaFile( metaFile );
    wprintf( L"%s\n", logFont.lfFaceName );
        // 'Segoe UI Symbol' in Win 10 (ok)
        // 'Microsoft Sans Serif' in Win 7 (wrong, should be 'Segoe UI Symbol')
        // 'Tahoma' in Win XP for characters above 0xFFFF (wrong, should be 'Microsoft Sans Serif', I guess)
    
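    // Get glyph indices for the 'sUnicode' string
    DeleteObject( hFont );                           // menu font is no longer selected into any DC
    hFont = CreateFontIndirectW( &logFont );
    HGDIOBJ oldFont = SelectObject( hdc, hFont );
    GCP_RESULTSW infoStr = {0};
    infoStr.lStructSize = sizeof(GCP_RESULTSW);
    wchar_t tempStr[wcslen(sUnicode) + 1];           // +1 for the terminating null written by wcscpy
    wcscpy( tempStr, sUnicode );
    infoStr.lpGlyphs = tempStr;
    infoStr.nGlyphs = wcslen(tempStr);
    GetCharacterPlacementW( hdc, tempStr, wcslen(tempStr), 0, &infoStr, GCP_GLYPHSHAPE );
    SelectObject( hdc, oldFont );                    // restore the original font before deleting ours
    DeleteObject( hFont );
    ReleaseDC( hwnd, hdc );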

    // Return one string
    if( infoStr.lpGlyphs[0] == 3 || // for Windows 7 and 10
        infoStr.lpGlyphs[0] == 0 )  // for Windows XP
        return sLimited;
    else
        return sUnicode;
}

// Callback function to intercept font creation
int CALLBACK metaFileEnumProc( HDC hdc, HANDLETABLE *table, const ENHMETARECORD *record,
                            int tableEntries, LPARAM logFont ) {
    if( record->iType == EMR_EXTCREATEFONTINDIRECTW ) {
        const EMREXTCREATEFONTINDIRECTW* fontRecord = (const EMREXTCREATEFONTINDIRECTW *)record;
        *(LOGFONTW *)logFont = fontRecord->elfw.elfLogFont;
    }
    return 1;
}

You can call it with checkGlyphExist( hWnd, L"\xD83D\xDDA4", L"<3" );

I tested on Windows 10 and on two virtual machines: Windows 7 Professional and Windows XP SP2.
It works quite well, but two problems remain with the fallback font that EnumEnhMetaFile retrieves when a glyph is missing from the base font:

  • in Windows 7 it is always Microsoft Sans Serif, but the real fallback font should be Segoe UI Symbol.
  • in Windows XP it is Tahoma instead of Microsoft Sans Serif, but only for surrogate-pair characters (for BMP characters it is Microsoft Sans Serif, which is correct, I guess).

Can someone help me solve this?

Salvador
  • You could ask for the width of that character in that font – Basile Starynkevitch Dec 15 '17 at 23:05
  • It may be that the control is using Uniscribe on older systems and [DirectWrite](https://msdn.microsoft.com/en-us/library/dd371569) in Windows 10. DirectWrite is supported back to Vista. – Eryk Sun Dec 16 '17 at 01:58
  • @BasileStarynkevitch Do you mean checking whether the missing glyph has zero width? I tried `GetTextExtentPoint32`, `DrawTextEx(DT_CALCRECT)`, `TextOut`+`GetPath`, and even when the glyph is missing they all return a full width, I suppose the width of the replacement box ▯. – Salvador Dec 16 '17 at 02:27
  • @eryksun Yes but I have no idea how to intercept the Uniscribe/DirectWrite creation of a menu button to know if the glyph is found or not :-( – Salvador Dec 16 '17 at 02:55
  • Just on a side note: Usually you would pick a font that supports everything you display in your application and ship said font with your application to avoid these problems, or use images. Nonetheless interesting question. – deW1 Dec 16 '17 at 11:51
  • @deW1 Well, I'd prefer to use the system font for the menu, but also the idea to create my own font is thrilling to me! And yes, using images is an option I've considered. – Salvador Dec 16 '17 at 15:02
  • You should be able to do this in DirectWrite, but your question is in C. Add C++ tag if you need C++ answer. – Barmak Shemirani Dec 17 '17 at 10:10
  • @BarmakShemirani I'm coding my program only in C, not C++, and looking for a C solution. – Salvador Dec 17 '17 at 18:30
  • Your code seems to work nicely (except a couple of `HFONT` leaks). What's the significance of `3` in `lpGlyphs[0] == 3`? It seems to match `wgBlank` in `SCRIPT_FONTPROPERTIES`, but I thought it should be `wgDefault`. I think you answered your question. As to your main problem, you should probably give up and just use bitmaps (or icons for better DPI awareness) Consistency in font mapping is too hard. – Barmak Shemirani Dec 18 '17 at 17:46
  • @BarmakShemirani HFONT leaks? Maybe you mean I shouldn't reuse the `hFont` variable.. `3` is not `wgBlank` nor `wgDefault`, but is the glyph index of SPACE (U+0020), because `GetCharacterPlacement` replaces missing glyphs with spaces. The question is not answered: my code doesn't work in Win7 and WinXP. I'm still looking for the hard font solution – Salvador Dec 19 '17 at 21:13
  • You are creating 2 fonts with `CreateFontIndirect`. Call `DeleteObject` when you no longer need the font, see documentation. `wgBlank` is the glyph for blank space, so that fits in with your explanation. – Barmak Shemirani Dec 19 '17 at 21:42
  • I tested your code in Win10 and Win7, it shows if the character is tofu or not. That was the title of your question. I think your final goal is to print `DrawText(L"ABC",...)` and have a consistent look on different systems. That would be difficult, and a different question. You could instead print the characters separately. For example `DrawText(L"",...)` (always using `"Segoe UI Symbol"`) followed by `DrawText(L"123",...)` using a different font. – Barmak Shemirani Dec 19 '17 at 21:45
  • @BarmakShemirani You are right: `DeleteObject(hFont)` was missing, and `3` corresponds to `wgBlank`. I would simply use `swprintf(buf,c,L"%sABC",checkGlyphExist(hwnd,L"",L":)"))` and `InsertMenu(...,buf)`. *Segoe UI Symbol* exists only from Win7, and has the emoji glyphs only from Win8. I'd like to use the fonts that the user has already installed. Apparently my code works, but actually in many cases retrieves the wrong fallback font. – Salvador Dec 28 '17 at 23:01
  • You should add that as answer, and post a different question about font fallback. You can install your own font and set the fallback font in registry. Compatibility with the old Windows XP is too ambitious. – Barmak Shemirani Dec 28 '17 at 23:16

2 Answers

First, you have to make sure you're using the same API on both Win7 and Win10. The lower-level gdi32 API is not supposed to support surrogate pairs in general, I think, while the newer DirectWrite does, on every level. The next thing to keep in mind is that font fallback data (font linking is a different thing) differs from release to release; it's not something the user has access to, and it's not modifiable.

The second thing to check is whether Win7 provides a font for the symbol at U+1F5A4 in the first place; it's possible it was introduced only in later versions.

Basically, if you're using system rendering functionality, older or newer, you're not supposed to control fallback most of the time; if it doesn't work for you, it usually means it won't work. DirectWrite allows custom fallback lists, where you can, for example, explicitly assign U+1F5A4 to any font you want that supports it, including custom fonts that you can bundle with your application.

If you want a more detailed answer, you'll need to show some source excerpts that don't work for you.

bunglehead
  • An upvote because it suggests the following solution: check the Windows version first! Then decide what to do for the lower version. – Jongware Dec 17 '17 at 07:29
  • I'd really like to find a solution that doesn't check the Windows version, but rather check for example if surrogate pairs are supported by the system. I've added in my question a version agnostic code that works quite well, waiting for improvements. I'm coding in C and DirectWrite is only C++, I guess. I have seen it's impossible to set a font only for my program menu, unless I make an owner drawn menu, but at that point it's easier to use some image. – Salvador Dec 18 '17 at 00:16
  • @Salvador, right, I don't mean you should check Windows version directly, it's not pretty. It's better to check if functionality is supported instead, but that could be non trivial in your case. You can use DirectWrite from C too, it's not meant for that, but it's possible with some tricks. – bunglehead Dec 18 '17 at 04:38
  • I'm trying to use DirectWrite with C, particularly `GetSystemFontFallback`, but it's too difficult a task for my poor knowledge. – Salvador Dec 24 '17 at 09:22

I believe the high and low 16-bit words are well defined for surrogate pairs. You should be able to identify surrogate pairs by checking the range of values for each of the 16-bit words.

The high word should be in the range 0xD800 to 0xDBFF, and the low word in the range 0xDC00 to 0xDFFF.

If two consecutive "characters" meet these criteria, they are a surrogate pair.

See the Wikipedia article on UTF-16 for more information.
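
A minimal sketch of that range check in plain C could look like this. The helper names `is_high_surrogate`, `is_low_surrogate` and `decode_surrogate_pair` are made up for illustration and are not part of any Windows API; the last one also combines the pair into the code point it encodes, following the UTF-16 definition.

```c
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helpers for illustration; not part of any Windows API. */

/* High (lead) surrogates occupy 0xD800..0xDBFF. */
int is_high_surrogate( wchar_t c ) { return c >= 0xD800 && c <= 0xDBFF; }

/* Low (trail) surrogates occupy 0xDC00..0xDFFF. */
int is_low_surrogate( wchar_t c ) { return c >= 0xDC00 && c <= 0xDFFF; }

/* Combines a valid high/low pair into the code point it encodes.
   Returns 0 if the two units do not form a surrogate pair. */
uint32_t decode_surrogate_pair( wchar_t hi, wchar_t lo ) {
    if( !is_high_surrogate( hi ) || !is_low_surrogate( lo ) )
        return 0;
    return 0x10000u + ( ( (uint32_t)hi - 0xD800u ) << 10 ) + ( (uint32_t)lo - 0xDC00u );
}
```

For the character in the question, `decode_surrogate_pair( 0xD83D, 0xDDA4 )` gives 0x1F5A4.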

Olan
  • But OP's original text already *is* a correct surrogate pair, and it displays correctly – if the font obliges. In addition, you are assuming a Font Fallback routine will change the existing string (or return a pointer to one, possibly). That is usually not the case. – Jongware Dec 16 '17 at 00:26
  • GDI supports surrogate pairs so long as the font supports the code point in question, and has since at least Win2K. – SoronelHaetir Dec 19 '17 at 06:53