#include <locale>
int main(){
    std::locale::global(std::locale("en_GB.utf8"));
}

From my reading, this is the way to set a locale in my C++ program. But what string should I pass to the std::locale constructor if I want an Arabic locale? Will passing "ar_AR.utf8" work?

Meaning if I write:

#include <locale>
int main(){
    std::locale::global(std::locale("ar_AR.utf8"));
}

Will this work? My plan is to eventually write to a file. Bear in mind that my current locale is "C".

I know this because I wrote

#include <iostream>
#include <locale>
int main(){
    std::cout << std::locale().name() << std::endl;
}

and it prints C.

#include <iostream>
#include <locale>
int main(){
    std::locale::global(std::locale("ar_AR.utf8"));
    std::cout << std::locale().name()<< std::endl;
}

It prints ar_AR.utf8.

Later I tried this:

#include <iostream>
#include <locale>
int main() {
    std::locale::global(std::locale("ar_AR.utf8"));
    std::cout << 'ت' << std::endl;
    std::cin.get();
}

It prints ?.

#include <iostream>
#include <locale>
int main() {
    std::locale::global(std::locale("ar_AR.utf8"));
    std::cout << L"ت" << std::endl;
    std::cin.get();
}

With the wide string literal L"ت", it prints a number: 00007FF634AA0310.

#include <iostream>
#include <locale>
int main() {
    std::locale::global(std::locale("ar_AR.utf8"));
    std::cout << "ت" << std::endl;
    std::cin.get();
}

With a narrow string literal (without the L prefix), it prints ?.

#include <iostream>
#include <locale>
int main() {
    std::locale::global(std::locale("ar_AR.utf8"));
    std::cout << 2.5 << std::endl;
    std::cin.get();
}

I tried to print 2.5, and 2.5 was printed.

Benjamin Buch
beginner
  • what happened when you tried? – 463035818_is_not_an_ai Mar 15 '23 at 08:35
  • it printed the "ar_AR.utf8" string, but that does not mean it works. Maybe I should try with `std::ofstream` or maybe `std::wofstream` – beginner Mar 15 '23 at 08:42
  • "but that does not mean it works." Why not? The thing is, even if someone tells you "yes", you will still need to try, and eventually it will work for you or not. We cannot know whether it works for you before you try – 463035818_is_not_an_ai Mar 15 '23 at 08:45
  • Whether it works depends on whether your system supports that particular locale. If it doesn't support it at all, an exception will be thrown. Why don't you try to output something locale-dependent, like dates? – molbdnilo Mar 15 '23 at 08:46
  • for example here https://godbolt.org/z/PK67Trsrc the locale is not available – 463035818_is_not_an_ai Mar 15 '23 at 08:47
  • i tried writing `std::cout<<'ت'< – beginner Mar 15 '23 at 08:50
  • If setting the locale throws no error, and printing the locale prints the expected name, but using it fails, then that's important information. Please edit your question to show the code you used – 463035818_is_not_an_ai Mar 15 '23 at 09:02
  • What happens if you make that "character" a string literal instead of character literal? UTF-8 is multi-byte. I'm not sure if what you're trying to do here will work as written. – paddy Mar 15 '23 at 09:25
  • it prints a number `00007FF634AA0310` – beginner Mar 15 '23 at 09:30
  • Not a _wide_ string... `std::cout` does not operate on those (although `std::wcout` would have worked). Like this: https://godbolt.org/z/zEq9oceaG -- and note that if you enable compiler warnings, it should have complained about the multi-character character constant. – paddy Mar 15 '23 at 09:34
  • it prints `?` @paddy – beginner Mar 15 '23 at 09:36
  • In that case, I must ask: does your terminal use a font that supports this character? – paddy Mar 15 '23 at 09:37
  • What are your OS and compiler? Note that `'ت'` is always wrong, and `std::cout << L"ت" ` is always wrong too. `std::cout << "ت" << std::endl;` may or may not work as expected depending on your environment. – n. m. could be an AI Mar 15 '23 at 09:38
  • There are two issues here, one is encoding of [string literals](https://en.cppreference.com/w/cpp/language/string_literal), the other one is locale. Although both are related to (natural) language, they have little to do with each other. Locale controls e.g. the formatting of dates or decimal numbers. Try outputting 2.5 (I dimly recall Arabic uses comma as a decimal separator) to confirm that. – Friedrich Mar 15 '23 at 09:41
  • @n.m. Windows 11. I am not sure about the compiler; I just use Microsoft Visual Studio from the Microsoft website – beginner Mar 15 '23 at 09:44
  • @Friedrich `2.5` is printed out – beginner Mar 15 '23 at 09:47
  • because yesterday I downloaded and install keyboard support from microsoft for the arabic language – beginner Mar 15 '23 at 09:51
  • under Time & Language > Language & Region >Add a language – beginner Mar 15 '23 at 09:53
  • Works for me just fine on Windows 11 using MSVC 2022 (14.32.31326). Make sure your terminal is using UTF code page (65001). (note using `std::cout << "ت" << std::endl;`, your two other attempts are *totally wrong*, don't waste your time). – n. m. could be an AI Mar 15 '23 at 09:53
  • also in the project setting i use UNICODE as character set but in the source code i use the utf-8 character – beginner Mar 15 '23 at 09:54
  • @n.m. is it like this https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8 on how to set the terminal code page to UTF? – beginner Mar 15 '23 at 10:02
  • @Friedrich (1) The arabic locale on Windows does not use comma as a decimal separator. The French locale does. (2) The global locale is irrelevant. It does not affect `std::cout` at all. You need to imbue a locale to `std::cout` to change its locale (but this will not help render the correct Arabic character). – n. m. could be an AI Mar 15 '23 at 10:05
  • I have no idea, I am using Windows terminal app (not the obsolete console) I don't think I have ever modified the registry. I have 65001 as the default code page. Before mucking with the registry, try typing `chcp 65001` and enter in your terminal. – n. m. could be an AI Mar 15 '23 at 10:09
  • I think you *probably* need [this](https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window) instead but don't quote me on that. – n. m. could be an AI Mar 15 '23 at 10:17

1 Answer


You need to set the locale on the stream itself, using imbue.

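#include <iostream>
#include <locale>

int main(){
    std::cout << 2.5 << '\n';                     // default "C" formatting
    std::cout.imbue(std::locale("de_DE.UTF-8"));  // throws if the locale is not installed
    std::cout << 2.5 << '\n';                     // German formatting
}

Output:

2.5
2,5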
Benjamin Buch
  • I made the compiler treat warnings as errors, and it reports warning C4566: character represented by universal-character-name '\u062A' cannot be represented in the current code page (1252) – beginner Mar 16 '23 at 06:46
  • I tried setting the code page of the terminal to 65001 by typing `chcp 65001` in the `Developer Command Prompt` and then compiled the source file `main.cpp` manually by typing `cl /EHsc main.cpp`. The compiler still complains that the current code page is `1252`, even though I had already set it to `65001` – beginner Mar 16 '23 at 08:05
  • @beginner That's an issue with Microsoft Windows Terminals. They have support for Unicode, but you must [explicitly enable](https://stackoverflow.com/questions/388490/how-to-use-unicode-characters-in-windows-command-line) it. C++23 std::print will fix this, but it's not implemented yet. You can use the [fmt](https://github.com/fmtlib/fmt) library with [fmt::print](https://fmt.dev/latest/api.html) instead. Note that character encoding is something very different from locales! – Benjamin Buch Mar 16 '23 at 10:32