For the past few days I have been trying to write C++ code that correctly converts the Turkish I character to lowercase ı with VS2022 on Windows.
As I understand it, Turkish I has the same Unicode code point as the regular Latin I, so I need to set the locale to Turkish before converting. I used the following code:
#include <clocale>
#include <cwctype>
#include <fstream>
#include <iostream>
#include <locale>
#include <string>
int main() {
    std::wstring input_str = L"I";
    std::setlocale(LC_ALL, "tr_TR.UTF-8"); // This should impact std::towlower
    std::locale loc("tr_TR.UTF-8");
    std::wofstream output_file("lowercase_turkish.txt");
    output_file.imbue(loc);
    for (wchar_t& c : input_str) {
        c = std::towlower(c);
    }
    output_file << input_str << std::endl;
    output_file.close();
}
It worked fine on Linux, outputting ı, but on Windows it did not work correctly and output i instead of ı.
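One thing that may be worth ruling out is whether the locale name is accepted at all on each platform, since std::setlocale returns a null pointer when the request cannot be honored. The following is only a minimal sketch of such a check; the helper name try_set_locale is hypothetical and not part of any library, and a successful call does not by itself say anything about how std::towlower maps I.

```cpp
#include <clocale>

// Hypothetical helper: returns true if the C runtime accepted the
// requested locale name, false if std::setlocale reported failure.
bool try_set_locale(const char* name) {
    // std::setlocale returns a null pointer when the locale cannot be set.
    return std::setlocale(LC_ALL, name) != nullptr;
}
```

Calling try_set_locale("tr_TR.UTF-8") before the conversion loop would show whether the name was rejected on a given system.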
After some research, I think it is a bug in the Windows Unicode/ASCII mapping, so I tried an alternative solution using an external library, Boost. Here is my code:
#include <boost/algorithm/string.hpp>
#include <string>
#include <locale>
#include <iostream>
#include <fstream>
using namespace std;
using namespace boost::algorithm;
int main()
{
    std::string s = "I";
    std::locale::global(std::locale{ "Turkish" });
    to_lower(s);
    ofstream outfile("output.txt");
    outfile << s << endl;
    outfile.close();
    return 0;
}
Again, this outputs i instead of ı. Using to_lower_copy gives the same output.
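To make the expected behavior concrete, here is a minimal sketch of the mapping I am after, written by hand rather than through the locale. The function name turkish_to_lower is hypothetical (it is not a standard or Boost function), and the sketch only special-cases the two Turkish I letters, leaving every other character to std::towlower under whatever locale is active.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper: lowercases a wide string using the Turkish rules
// for the dotted and dotless I, and std::towlower for everything else.
std::wstring turkish_to_lower(std::wstring text) {
    for (wchar_t& c : text) {
        if (c == L'I') {
            c = L'\u0131';  // U+0131 LATIN SMALL LETTER DOTLESS I
        } else if (c == L'\u0130') {
            c = L'i';       // U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE maps to plain i
        } else {
            c = static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(c)));
        }
    }
    return text;
}
```

With this, turkish_to_lower(L"I") yields the single character U+0131, which is the result I expected from the locale-based versions above.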