
On Windows with Visual Studio 2017 I can use the following code to uppercase a u32string (which is based on char32_t):

#include <locale>
#include <iostream>
#include <string>

void toUpper(std::u32string& u32str, std::string localeStr)
{
    std::locale locale(localeStr);

    for (unsigned i = 0; i<u32str.size(); ++i)
        u32str[i] = std::toupper(u32str[i], locale);
}

The same code does not compile on macOS with Xcode. I get errors like this:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__locale:795:44: error: implicit instantiation of undefined template 'std::__1::ctype<char32_t>'
return use_facet<ctype<_CharT> >(__loc).toupper(__c);

Is there a portable way of doing this?

Jason Aller
Bastl
  • Feels like you're missing an `#include`. Hard to tell without a [mcve]. – Captain Obvlious Aug 08 '17 at 21:38
  • I don't think I'm missing a header file. I've added the includes to the example. – Bastl Aug 08 '17 at 21:42
  • Maybe your compiler's RTL simply does not implement `ctype<char32_t>`? On a side note, this seems like something you should be using `std::transform()` for instead of a manual loop, e.g.: `std::locale loc(localeStr); std::transform(u32str.begin(), u32str.end(), u32str.begin(), [&loc](char32_t c) -> char32_t { return std::toupper(c, loc); });` – Remy Lebeau Aug 08 '17 at 21:45
  • I think so. Is there a better (portable) way of doing this? – Bastl Aug 08 '17 at 21:48
  • Regarding this: https://stackoverflow.com/a/41316811/2007933 it seems that `ctype<char32_t>` is not supported – Bastl Aug 08 '17 at 21:52
  • Since the accepted answer breaks on Windows when the input is not restricted to the Basic Multilingual Plane, you might need to fall back on a library such as ICU on an implementation that doesn’t support `std::ctype<char32_t>`. – Davislor Aug 03 '23 at 17:03
  • Otherwise, you’re in `#ifdef` territory. – Davislor Aug 03 '23 at 17:17

1 Answer


I have found a solution:

Instead of std::u32string, I now use std::string with UTF-8 encoding. The conversion from std::u32string to a UTF-8 std::string can be done with utf8-cpp: http://utfcpp.sourceforge.net/
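If pulling in a library is not an option, the UTF-32 to UTF-8 step can also be written by hand. The following is only a minimal sketch, not utf8-cpp's API: `toUtf8` is a hypothetical helper name, and replacing invalid code points with U+FFFD is an assumption about the desired error handling.

```cpp
#include <string>

// Hypothetical helper: encode a UTF-32 string as UTF-8.
// Assumption: values that are not Unicode scalar values (surrogates,
// or anything above U+10FFFF) are replaced with U+FFFD.
std::string toUtf8(const std::u32string& in)
{
    std::string out;
    out.reserve(in.size());

    for (char32_t c : in)
    {
        if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
            c = 0xFFFD;

        if (c < 0x80)
        {
            // 1 byte: 0xxxxxxx
            out += static_cast<char>(c);
        }
        else if (c < 0x800)
        {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else if (c < 0x10000)
        {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else
        {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

The result can then be passed to a function that works on UTF-8 encoded std::string.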

For the case conversion itself, the UTF-8 string is first converted to std::wstring, because std::toupper is not implemented for char32_t on all platforms.

#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

void toUpper(std::string& str, const std::string& localeStr)
{
    // UTF-8 <-> wide string converter (std::wstring_convert is deprecated since C++17)
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;

    // convert to wstring (because std::toupper is not implemented on all platforms for u32string)
    std::wstring wide = converter.from_bytes(str);

    // default-constructed: a copy of the global locale, used as the fallback
    std::locale locale;

    try
    {
        locale = std::locale(localeStr);
    }
    catch (const std::exception&)
    {
        std::cerr << "locale not supported by system: " << localeStr << std::endl;
    }

    auto& f = std::use_facet<std::ctype<wchar_t>>(locale);

    // uppercase in place, one wchar_t at a time
    f.toupper(&wide[0], &wide[0] + wide.size());

    // convert back to UTF-8
    str = converter.to_bytes(wide);
}
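If the data has to stay in a std::u32string, the same `std::ctype<wchar_t>` facet can be applied per code point without the round trip through UTF-8. This is only a sketch: `toUpperViaWchar` is a hypothetical name, and it assumes that wchar_t values correspond to Unicode code points. Code points that do not fit into wchar_t (everything outside the BMP when wchar_t is 16 bits wide, as on Windows) are left unchanged.

```cpp
#include <limits>
#include <locale>
#include <string>

// Hypothetical helper: uppercase a std::u32string through std::ctype<wchar_t>.
// Assumption: a wchar_t value denotes the Unicode code point of the same value.
void toUpperViaWchar(std::u32string& s, const std::locale& loc)
{
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(loc);

    // largest code point that can be represented in a single wchar_t
    const char32_t maxWide =
        static_cast<char32_t>(std::numeric_limits<wchar_t>::max());

    for (char32_t& c : s)
    {
        // skip code points that would be truncated by the cast
        if (c <= maxWide)
            c = static_cast<char32_t>(facet.toupper(static_cast<wchar_t>(c)));
    }
}
```

Like the function above, this maps one character to one character, so it cannot handle mappings that change the length of the string.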

Note:

  • On Windows, localeStr has to be something like en, de, fr, ...
  • On other systems, localeStr has to be something like de_DE, fr_FR, en_US, ...
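Because the accepted names differ between platforms, one option is to try several spellings and take the first one the system accepts. A minimal sketch, where `firstAvailableLocale` is a hypothetical helper and falling back to the classic "C" locale is an assumption about what the caller wants:

```cpp
#include <initializer_list>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: return the first locale from `names` that can be
// constructed on this system, or the classic "C" locale if none can.
std::locale firstAvailableLocale(std::initializer_list<std::string> names)
{
    for (const std::string& name : names)
    {
        try
        {
            return std::locale(name);
        }
        catch (const std::runtime_error&)
        {
            // this name is not known on this platform, try the next one
        }
    }
    return std::locale::classic();
}
```

A call such as `firstAvailableLocale({"de_DE.UTF-8", "de_DE", "de"})` would then cover both naming styles. Which of these names actually exist depends on the locales installed on the system.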
Bastl
  • Note that this will not work on Windows systems where a `wstring` is a UTF-16 string containing surrogate pairs. (This violates the language standard, but Microsoft decided on the Windows API before there were supplementary planes, and wasn’t about to break it.) – Davislor Aug 03 '23 at 16:48