
For a few days I have been trying to write C++ code that correctly converts the Turkish capital I to lowercase ı with VS2022 on Windows.
As I understand it, the Turkish I has the same Unicode code point as the regular Latin I, so I need to set the locale to Turkish before converting. I used the following code:

#include <clocale>
#include <cwctype>
#include <fstream>
#include <iostream>
#include <locale>
#include <string>

int main() {
    std::wstring input_str = L"I";
    std::setlocale(LC_ALL, "tr_TR.UTF-8"); // This should impact std::towlower
    std::locale loc("tr_TR.UTF-8");
    std::wofstream output_file("lowercase_turkish.txt");
    output_file.imbue(loc);

    for (wchar_t& c : input_str) {
        c = std::towlower(c);
    }

    output_file << input_str << std::endl;
    output_file.close();
}

It worked fine on Linux, outputting ı, but on Windows it output i instead of ı.
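One thing the code above does not do is check whether the locale name was accepted at all. A minimal sketch of such a check follows; the helper names `locale_name_is_accepted` and `c_locale_name_is_accepted` are hypothetical and not part of the original code. It relies only on `std::locale` throwing `std::runtime_error` for an unknown name and on `std::setlocale` returning a null pointer on failure.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if std::locale can be constructed from this name.
// The std::locale constructor throws std::runtime_error for unknown names.
bool locale_name_is_accepted(const std::string& name) {
    try {
        (void)std::locale(name.c_str());
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Hypothetical helper: true if the C runtime accepted the name.
// Note that this changes the global C locale when it succeeds.
bool c_locale_name_is_accepted(const std::string& name) {
    return std::setlocale(LC_ALL, name.c_str()) != nullptr;
}
```

Both helpers could be called with `"tr_TR.UTF-8"` or `"Turkish"` before the conversion. Which names are accepted depends on the platform and on the locales installed, so a failure here would explain the output without any bug in the case mapping itself.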
After some research, I think it is a bug in the Windows Unicode/ASCII mapping, so I tried an alternative solution using an external library, Boost. Here is my code:

#include <boost/algorithm/string.hpp>
#include <string>
#include <locale>
#include <iostream>
#include <fstream>
using namespace std;

using namespace boost::algorithm;

int main()
{
    std::string s = "I";
    std::locale::global(std::locale{ "Turkish" });
    to_lower(s);
    ofstream outfile("output.txt");
    outfile << s << endl;
    outfile.close();
    return 0;
}

Again, it outputs i instead of ı. Using to_lower_copy gives the same result.

user2401856
  • `boost::algorithm::to_lower` returns `void` and works on the `std::string` you pass in by reference. You should also set the locale before doing any conversions. – Ted Lyngmo Feb 19 '23 at 13:42
  • @TedLyngmo I updated my code. Is it right? It still outputs `i`. – user2401856 Feb 19 '23 at 13:46
  • Does setting the locale work? You don't check that it does. – Ted Lyngmo Feb 19 '23 at 13:58
  • The Turkish dotless ı has the Unicode value 0x131, so it does not fit in a normal `char`. In other words, you have an encoding issue to solve as well as a translation issue. You might have more luck with `std::wstring`. – john Feb 19 '23 at 14:08
  • You may want to use boost::locale::gen, look [here](https://stackoverflow.com/questions/19751189/boostlocaleto-lower-throw-bad-cast-exception). Also bear in mind that boost locales may be backed by stdlib, winapi, or ICU. You want the ICU backend, see [here](https://www.boost.org/doc/libs/1_54_0/libs/locale/doc/html/using_localization_backends.html) – n. m. could be an AI Feb 19 '23 at 14:42
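
The point about `std::wstring` in the comments can be illustrated with a locale-independent fallback. This is only a sketch under stated assumptions, not the approach any commenter proposed: the function name `turkish_to_lower` is hypothetical, and it special-cases only the two Turkish letters (I to U+0131, U+0130 to i), leaving every other character to `std::towlower`.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper: lowercase a wide string using Turkish rules for I.
// Only the two Turkish-specific mappings are handled explicitly;
// all other characters go through std::towlower in the current C locale.
std::wstring turkish_to_lower(std::wstring text) {
    for (wchar_t& c : text) {
        if (c == L'I') {
            c = L'\u0131';  // capital I -> dotless i (U+0131)
        } else if (c == L'\u0130') {
            c = L'i';       // capital dotted I (U+0130) -> plain i
        } else {
            c = static_cast<wchar_t>(std::towlower(c));
        }
    }
    return text;
}
```

This avoids depending on whether the platform's Turkish locale is available, at the cost of hard-coding the mapping. Writing the result to a file still depends on the stream's locale or on an explicit UTF-8 encoding step.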
