I'm running into some really weird behaviour when I use imbue to set the locale for cin.

// example.cpp
#include <iostream>
#include <iomanip>
#include <locale>
int main(){
# ifdef LOCALE
  std::cin.imbue(std::locale(LOCALE));
# endif
  long temp;
  const bool status = static_cast<bool>(std::cin >> temp);
  std::cout << std::boolalpha << status << " " << temp << std::endl;
}

I can compile and run this code without issue if I don't imbue the current locale.

$ g++ example.cpp -o no-imbue -std=c++1y -stdlib=libc++ -Wall -Wextra -Werror
$ echo 100 | no-imbue
true 100
$ echo 1001 | no-imbue
true 1001

However, if I imbue the current locale, std::cin >> temp starts failing for four digit numbers:

$ g++ example.cpp -o imbue-empty -DLOCALE='""' -std=c++1y -stdlib=libc++ -Wall -Wextra -Werror
$ echo 100 | imbue-empty
true 100
$ echo 1001 | imbue-empty
false 1001

Using "en_US.UTF-8" as the locale name instead of "" seems to have the same effect.

$ g++ example.cpp -o imbue-utf8 -DLOCALE='"en_US.UTF-8"' -std=c++1y -stdlib=libc++ -Wall -Wextra -Werror
$ echo 100 | imbue-utf8
true 100
$ echo 1001 | imbue-utf8
false 1001

I'm on OS X, using clang-600.0.57:

$ g++ --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

Is this a bug with the compiler, or am I doing something wrong?

rampion
  • Ninjaedit> Reproduced for `"en_US.UTF-8"` locale, however cannot be reproduced for the empty locale case (`Apple LLVM version 7.3.0 (clang-703.0.29)`). – dfrib Aug 08 '16 at 17:17
  • It [looks like there were locale issues with that version of the compiler](https://gcc.gnu.org/ml/gcc-bugs/2016-02/msg01518.html) – rampion Aug 08 '16 at 17:19
  • @dfri: Yeah, the empty case didn't reproduce when I ran it within a cram testfile, but did at the command line. – rampion Aug 08 '16 at 17:20
  • Have a look at [this thread](http://stackoverflow.com/questions/1745045) as well as [this one](http://stackoverflow.com/questions/11190072/), possibly the same/a related issue. – dfrib Aug 08 '16 at 17:30
  • Looks like a [libc++/libstdc++ difference](http://coliru.stacked-crooked.com/a/86fe0dc79bbde2b2). – T.C. Aug 08 '16 at 17:55

1 Answer

If you input 1,001, your program should print true.

The en_US locale expects a comma between each group of three digits. Because you didn't provide one, std::num_get::get() sets failbit on std::cin. See the link for more detail, but the relevant excerpts are:

Stage 2: character extraction

If the character matches the thousands separator (std::use_facet<std::numpunct<charT>>(str.getloc()).thousands_sep()) and thousands separation is in use at all (that is, std::use_facet<std::numpunct<charT>>(str.getloc()).grouping().length() != 0), then if the decimal point '.' has not yet been accumulated, the position of the character is remembered, but the character is otherwise ignored. If the decimal point has already been accumulated, the character is discarded and Stage 2 terminates.

And

Stage 3: conversion and storage

After this, digit grouping is checked. If the position of any of the thousands separators discarded in Stage 2 does not match the grouping provided by std::use_facet<std::numpunct<charT>>(str.getloc()).grouping(), std::ios_base::failbit is assigned to err.
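
To see the grouping check in isolation, here is a minimal sketch that does not depend on which named locales are installed. `GroupsOfThree`, `grouped_locale`, `without_grouping` and `parse_long` are names made up for this illustration; the facet only imitates the en_US numeric punctuation (',' separator, groups of three) and is not the real en_US locale.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet standing in for en_US numeric punctuation:
// ',' as the thousands separator and groups of three digits.
struct GroupsOfThree : std::numpunct<char> {
protected:
  char do_thousands_sep() const override { return ','; }
  std::string do_grouping() const override { return "\3"; }
};

// The "C" locale with its numpunct facet replaced by GroupsOfThree.
// The locale takes ownership of the facet and deletes it when unused.
std::locale grouped_locale() {
  return std::locale(std::locale::classic(), new GroupsOfThree);
}

// A copy of `base` whose numeric category (num_get, num_put, numpunct)
// comes from the "C" locale, so no digit grouping is applied on input.
std::locale without_grouping(const std::locale& base) {
  return std::locale(base, std::locale::classic(), std::locale::numeric);
}

// Extract a long from `text` with `loc` imbued on the stream.
// Returns true if the extraction succeeded.
bool parse_long(const std::string& text, const std::locale& loc, long& out) {
  std::istringstream in(text);
  in.imbue(loc);
  return static_cast<bool>(in >> out);
}
```

With these helpers, `parse_long("1,001", grouped_locale(), v)` succeeds and stores 1001, while `"10,01"` fails because the separator position does not match the grouping. Whether `"1001"` (no separator at all) is accepted under `grouped_locale()` depends on the standard library implementation, which is the behaviour in question here. If the grouping check is unwanted, something like `std::cin.imbue(without_grouping(std::locale("")))` keeps the rest of the user's locale but reads numbers with "C" locale rules; treat that as one possible workaround rather than the definitive fix.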

Miles Budnek
  • The wording here only covers *thousands separators discarded*, it doesn't state "or absent", making the use of the separators optional: – kfsone Aug 08 '16 at 18:00
  • I can confirm that `echo 1,001 | imbue-utf8` outputs `true 1001`, so it looks like this is it - the parser is requiring the thousands separator for this locale. Whether that behaviour is a bug or not is one for the forums. – rampion Aug 08 '16 at 18:05
  • The standard makes it clearer: (22.4.2.1.2/4) `Digit grouping is checked. That is, the positions of discarded separators is examined for consistency with use_facet<numpunct<charT> >(loc).grouping(). If they are not consistent then ios_base::failbit is assigned to err.` – kfsone Aug 08 '16 at 18:05
  • That wording still seems ambiguous, and it seems libstdc++ chose a different interpretation than libc++. Just checked my Ubuntu box, and the thousands separator isn't needed there. – Miles Budnek Aug 08 '16 at 18:16
  • @kfsone so if nothing was discarded, there is nothing to check, right? – Cubbi Aug 09 '16 at 15:21