C++11 introduced the c16rtomb()/c32rtomb() conversion functions, along with their inverses, mbrtoc16()/mbrtoc32(). The reference documentation for c16rtomb() clearly states:

The multibyte encoding used by this function is specified by the currently active C locale

The documentation for c32rtomb() states the same. Both the C and C++ versions agree that these are locale-dependent conversions (as they should be, according to the naming convention of the functions themselves).

However, MSVC seems to have taken a different approach and made them locale-independent (that is, they do not use the current C locale), according to this document. There, these conversion functions are listed under the heading "Locale-independent multibyte routines".
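One way to check what a given implementation actually does is a small probe like the sketch below. It is only an illustration: `encode_with_locale` is a hypothetical helper name, not something from either set of documentation, and any locale name other than "C" is an assumption about what is installed on the machine.

```cpp
#include <cassert>   // assert
#include <climits>   // MB_LEN_MAX
#include <clocale>   // std::setlocale, LC_ALL
#include <cstddef>   // std::size_t
#include <cuchar>    // std::c32rtomb
#include <cwchar>    // std::mbstate_t
#include <string>

// Hypothetical helper: encode one code point with c32rtomb() while the named
// C locale is active, then restore the previous locale.
// Returns false if the locale cannot be set or the conversion fails.
bool encode_with_locale(char32_t c, const char* locale_name, std::string& out) {
    assert(locale_name != nullptr);

    // Copy the current locale name; the returned pointer may be invalidated
    // by the next setlocale() call.
    const char* current = std::setlocale(LC_ALL, nullptr);
    const std::string saved = current ? current : "C";

    if (std::setlocale(LC_ALL, locale_name) == nullptr) {
        return false;  // requested locale is not available
    }

    char buf[MB_LEN_MAX];
    std::mbstate_t state{};  // zero-initialized: initial conversion state
    const std::size_t n = std::c32rtomb(buf, c, &state);

    std::setlocale(LC_ALL, saved.c_str());

    if (n == static_cast<std::size_t>(-1)) {
        return false;  // code point not representable in this encoding
    }
    out.assign(buf, n);
    return true;
}
```

Calling the helper for a non-ASCII code point such as U+00E9 under "C" and again under a UTF-8 or legacy single-byte locale, then comparing the resulting bytes, should show whether the locale is consulted. If the output is identical UTF-8 in every case, the implementation is behaving as locale-independent.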

C++20 adds to the confusion by including the c8rtomb()/mbrtoc8() functions, which if locale-independent would basically do nothing, converting UTF-8 input to UTF-8 output.

Two questions arise from this:

  1. Do any other compilers actually follow the standard and implement locale-dependent Unicode multibyte conversion routines? I couldn't find any concrete information after extensive searching.
  2. Is this a bug in MSVC's implementation?
owacoder
  • Curiously https://stackoverflow.com/questions/53148386/c-unicode-how-do-i-apply-c11-standard-amendment-dr488-fix-to-c11-standard-funct quotes the same page as saying "Convert a UTF-16 wide character into a multibyte character in the current locale." so apparently the documentation *changed*? – Mooing Duck Dec 28 '22 at 18:04
  • @MooingDuck yep. https://github.com/MicrosoftDocs/cpp-docs/commit/ea5f98a72ec7a0341f6a6acce84ee1f0bfb41b5b – Osyotr Dec 29 '22 at 00:12
  • This appears to be a bug in MSVC's implementation, but I can't fathom *why* they did such a thing. – Mooing Duck Dec 29 '22 at 17:34
  • Relevant: see [this answer](https://stackoverflow.com/a/42527030/5264388) on SO. – owacoder Jul 30 '23 at 10:47
