Some information:
- I've only tried this on Linux
- I've tried both with GCC (7.2.0) and Clang (3.8.1)
- It requires C++11 or higher to my understanding
What happens when I run it
I get the expected string "abcd" repeated until it hits the position of 4094 characters. After that all it outputs is this sign "?" until the end of the file.
What do I think about this?
I think this is not the expected behavior and that it must be a bug somewhere.
Code you can test with:
#include <iostream>
#include <fstream>
#include <locale>
#include <codecvt>
void createTestFile() {
std::ofstream file ("utf16le.txt", std::ofstream::binary);
if (file.is_open()) {
uint16_t bom = 0xFEFF; // UTF-16 little endian BOM
uint64_t abcd = 0x0064006300620061; // UTF-16 "abcd" string
file.write((char*)&bom,2);
for (size_t i=0; i<2000; i++) {
file.write((char*)&abcd,8);
}
file.close();
}
}
int main() {
//createTestFile(); // uncomment to make the test file
std::wifstream file;
std::wstring line;
file.open("utf16le.txt");
file.imbue(std::locale(file.getloc(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));
if (file.is_open()) {
while (getline(file,line)) {
std::wcout << line << std::endl;
}
}
}