How can I print foreign characters on the screen using C?

Here's my code, which doesn't work:

#include <stdio.h>
#include <locale.h>

int main(){

    setlocale(LC_ALL,"Turkish"); 

    printf("İ ş ğ ü ö ı");

    system("pause");
    return 0;
}
phuclv
Recai
  • 1) Add `"\n"` --> `"İ ş ğ ü ö ı\n"` to ensure we do not have an I/O issue - better yet use `puts("İ ş ğ ü ö ı\n");` 2) "which doesn't work" is not specific: what _does_ happen? – chux - Reinstate Monica May 28 '14 at 18:40
  • Seems to work in cygwin. No such command as pause though. – Elliott Frisch May 28 '14 at 18:41
  • I still have the same problem: weird characters are printed instead of the ones I wrote. – Recai May 28 '14 at 18:43
  • You can try: [wprintf](http://www.cplusplus.com/reference/cwchar/wprintf/) – 001 May 28 '14 at 18:48
  • @user3684881 1) Instead of saying "weird characters are printed", post the characters that are printed - just cut and paste. 2) Try a _simple_ example: `puts("abcİdef\n");` and report its results. – chux - Reinstate Monica May 28 '14 at 20:06
  • After changing my encoding setting in codeblocks from "WINDOWS-1252" to "WINDOWS-1254" the 'weird' characters are gone :D but I'm still getting incorrect output: "I s g ü ö i" – Recai May 28 '14 at 20:50
  • If you use printf, did you save the file as Windows-1254 or ISO 8859-9? Otherwise it'll be saved in Unicode and mix up your characters. If you use wprintf, just save it in any Unicode encoding – phuclv May 29 '14 at 09:08
  • With Windows-1254 or ISO 8859-9 I have the same output "I s g ü ö i" when I use puts or printf; on the other hand, using wprintf doesn't give any output. – Recai May 29 '14 at 09:17
  • Try changing the file encoding to UTF-8 in codeblocks. – mkhatib May 29 '14 at 17:33
  • @mkhatib if he's using printf then UTF-8 will just mess up his windows-1254 string – phuclv May 30 '14 at 02:33
  • @user3684881 after changing the codepage by `chcp 1254`, `wprintf` returns exactly the string above, no `?` or incorrect letters – phuclv May 30 '14 at 02:34
  • @LưuVĩnhPhúc it returns incorrect letters – Recai May 30 '14 at 12:47
  • @user3684881 look at my picture, exactly the same as the source string – phuclv May 30 '14 at 12:51
  • The problem I have is in my codeblocks settings. I tried changing the encoding, defining "unicode", and other things, but I just can't figure it out. – Recai May 30 '14 at 13:43
  • You must change the code page in the command prompt. – phuclv May 30 '14 at 13:57
  • Possibly you're using `printf` and saved the file as windows-1254 while your system currently uses another code page (or the reverse). gcc cannot tell which encoding the file uses, because no code page information is embedded in the file, so it assumes the file is windows-1252 or whatever code page Windows is using, and your string gets mixed up. Save the file as UTF-8 and use wprintf with code page 65001 instead – phuclv Jun 02 '14 at 03:36
  • I still have a problem with writing this letter to a file! wchar_t c=L'ğ'; fputwc(c,ptr); this code doesn't work – Recai Jun 02 '14 at 20:35
  • This applies to the console, not to files, because the console needs tricky handling to display characters in the correct charset. A file is just storage with nothing displayed, so just output the char or wchar_t array directly. What you should care about is how the file is interpreted, in which charset, when it is displayed. Also, please edit the title and add the appropriate tag: it's a problem with the Windows console, not related to C. And please tag me if you want to reply to my comments, otherwise I don't see any notifications – phuclv Jun 06 '14 at 03:30
  • @user3684881 It's not wrong; you just didn't select the correct encoding when opening the file. See my edits – phuclv Jun 06 '14 at 04:30
  • hey did you try what I said? – phuclv Jun 20 '14 at 14:07
  • Possible duplicate of [Output unicode strings in Windows console app](https://stackoverflow.com/questions/2492077/output-unicode-strings-in-windows-console-app) – phuclv Jun 05 '17 at 11:44

4 Answers

On my Windows installation, the 'Terminal' font has no such characters, so I think you can't print them.
But I suggest you check this font yourself. Maybe you have a different version of it.

HolyBlackCat
  • There's no "terminal font" in Windows. Windows uses Unicode, so any character can be printed; if it doesn't exist in a font, it'll be substituted by the same character from another font. Also, if he's using Turkish Windows or changed the console to a Turkish code page, then it must display Turkish characters – phuclv May 29 '14 at 03:08
  • The Windows terminal will crash if you try to print some symbols that it can't display. I had a Python program with a GUI that printed some Greek letters as part of its debug info; it worked fine on Linux and crashed for mysterious reasons on Windows. The reason was that doing ```print u"Σ"``` while running in a Windows terminal is a way to crash it. – LtWorf May 29 '14 at 09:12
  • @LtWorf that's probably a bug in Python. The Windows console uses Unicode but defaults to a traditional code page; if you specify the wrong code page, it displays `?` or substitutes some other similar characters. I've never seen a console crash because of those characters – phuclv May 30 '14 at 04:41
  • I guess win8 finally fixed that. If a bug in Python can crash the entire terminal emulator Python is printing to, I suspect it's a bug in the terminal emulator. – LtWorf May 30 '14 at 11:19
If you're using a narrow charset, you need to make sure that the terminal/console uses the same charset and that the source code file is saved in that same encoding; otherwise the system will misinterpret the character codes.

To set the charset in the console, run `chcp`. For example, to use code page Windows-1254, run `chcp 1254`. You can also set the code page programmatically with `SetConsoleOutputCP`, like `SetConsoleOutputCP(1254)`.

[image: turkish chars]

However, you should avoid the legacy ANSI code pages and use Unicode instead. The current preferred way on Windows is to output Unicode characters as wide chars with `wprintf`. You may need to set the mode to wide first (`_setmode` is declared in `<io.h>` and `_O_U16TEXT` in `<fcntl.h>`) with

int result = _setmode(_fileno(stdout), _O_U16TEXT);

then

wprintf(L"İ ş ğ ü ö ı");

See also the `wprintf` manual for Windows, Linux, or Mac. On POSIX systems, however, UTF-8 is preferred.

On older Windows versions, UTF-8 support in the console is not very good, but it is steadily getting better. Windows 10 even supports UTF-8 as a locale, so you can just call `SetConsoleOutputCP(CP_UTF8);` or `SetConsoleOutputCP(65001);` (or run `chcp 65001` in the console) and it'll work immediately, provided that you saved the source code as UTF-8. Remember to also set the font to one that supports those characters, like Lucida Console or Consolas. The default raster font contains a very limited number of characters and appears with a lot of aliasing. It also doesn't work well on modern HiDPI displays.

There are already lots of questions about outputting Unicode on this site like Output unicode strings in Windows console app or UTF-8 character in .NET Console Application. Please have a look and try to see which one fits you.

Edit

When you use

wchar_t c=L'ğ';
fputwc(c,ptr); 

you're printing to a file and not the console. In that case just the stream of bytes is saved into the file. When you open the file again, it's the job of the editor to interpret the bytes in the correct charset and display them correctly. For example, the character "ğ" is stored as c4 9f in UTF-8, and when you open the file as UTF-8, the editor knows that those bytes represent the char "ğ".

Unfortunately, there's no character encoding information embedded in a text file, so the editor must choose one. Remember: "There Ain't No Such Thing As Plain Text" (must read). A simple editor may just open the file as ANSI in the current Windows code page; if the original encoding is a different one, the characters won't be displayed correctly and you'll just see garbage.

Some more advanced editors like Notepad++ or MS Word will try to guess the encoding of the file. But as with any guessing, it can be wrong and the result is again a file with garbage

The simplest solution is to add a BOM to the beginning of the file so the editor can recognize the encoding easily. If your file doesn't contain a BOM and the editor picks the wrong encoding, you need to tell it to read the file in the correct one (for wchar_t on Windows like that, it's UTF-16LE). For example, in Notepad++ this is done from the Encoding menu.

Unfortunately, the OP didn't edit the question to show what was tried, so there's nothing more I can explain.

phuclv
  • And how can I check that? – Recai May 28 '14 at 21:55
  • What do you mean by "check"? If you mean checking whether the output is correct, then why not print it out? Just `wprintf(L"İ ş ğ ü ö ı");` is enough – phuclv May 29 '14 at 03:01
  • No, I didn't mean that. You said "you need to make sure that the terminal/console is using the same charset and the source code file is encoded in the correct encoding"; how can I check that? – Recai May 29 '14 at 08:00
  • Get the console [codepage](http://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using), then just save the file in the same charset your terminal is using. But don't use that; use Unicode instead, it works on any computer – phuclv May 29 '14 at 08:46
Your code works: http://ideone.com/K9hrv5

setlocale(LC_ALL,"Turkish");
printf("İ ş ğ ü ö ı");

The only issue is that you also have to set your terminal locale before running the compiled binary.

Setting your terminal locale works.

Prasanth
setlocale only affects the runtime locale. It doesn't make your compiler support extra source file characters.

You may need to specify the non-ASCII characters in your source file by using escape sequences in character or string constants (e.g. `\xF1` for the character with code 241).

M.M
  • How can I specify these characters in my source file? – Recai May 30 '14 at 12:49
  • By character constants; he's already said that clearly. But better, use Unicode and you'll avoid lots of problems. The way to detect and output UTF-8 characters has already been mentioned in my post and [here](http://stackoverflow.com/a/17177904/995714), please have a look – phuclv Jun 02 '14 at 03:59