3

I haven't been able to find a way to print a '—' character with `cout`. Whether I put it directly in the statement, as in `cout << "—";`, or use `char(151)`, the program prints a fuzzy undefined character. Do you see anything wrong with my code? Is printing an EM DASH with `cout` even possible?

Edit: I've also tried wcout << L"—"; and std::wcout << wchar_t(0x2014);. Those both print nothing in my terminal.

phuclv
Logan Kling
  • @immibus Is there not a way to do it for both? In Linux, it prints the fuzzy rectangle character; in Visual Studio it prints a 'ú' character. – Logan Kling Oct 09 '15 at 04:34
  • [How to print Unicode character in C++?](http://stackoverflow.com/q/12015571), [Printing UTF-8 strings with printf - wide vs. multibyte string literals](http://stackoverflow.com/q/15528359), [How to output unicode characters in C/C++](http://stackoverflow.com/q/17641718). In Linux, it's quite straightforward, just use UTF-8. In Windows it's a bit more difficult. If the current codepage doesn't have an em-dash, the only way is using Unicode – phuclv Oct 09 '15 at 04:35
  • @LưuVĩnhPhúc \u0151 prints a question mark. – Logan Kling Oct 09 '15 at 04:37
  • @LarryK where did I say that you should print `\u0151`? – phuclv Oct 09 '15 at 04:38
  • @LưuVĩnhPhúc You never said it, but it was one of the answers in one of your links. – Logan Kling Oct 09 '15 at 04:40
  • The Unicode value for em dash is `0x2014`. I thought `std::wcout << wchar_t(0x2014);` was the right thing, but I get two plain old dashes as output. It could be my terminal settings. – R Sahu Oct 09 '15 at 04:40
  • @RSahu For me, `std::wcout << wchar_t(0x2014);` prints nothing. – Logan Kling Oct 09 '15 at 04:42
  • @LarryK, it obviously depends on the terminal settings. Good luck with finding a working solution. – R Sahu Oct 09 '15 at 04:45

2 Answers

4

First of all, EM DASH is a Unicode character (just making sure you know that).

Printing Unicode characters depends on what you're printing to. If you're printing to a Unix terminal (or an emulator), the terminal uses an encoding that supports this character, and that encoding matches the compiler's execution encoding, then what you already wrote in your source code works: `cout << "—";`

If you're getting fuzzy undefined characters, it is possible that your terminal just doesn't support that character, or that its encoding doesn't match the one your program emits.

If you're on Windows (where it is harder), you can do something like this (which is not portable):

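#include <iostream>
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_U16TEXT
#include <stdio.h>  // _fileno, stdout

int main() {
    // Windows/MSVC only: put stdout into UTF-16 text mode.
    // After this call, write to stdout only through wide streams.
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::wcout << L"—";
}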
shafeen
0

There's no universal support for Unicode in C++ or across terminals, so there is no fully portable solution.

The thing is that the Windows console works with code pages by default. It probably uses UTF-16 internally, but it converts to and from the console's active code page when interacting with the outside. So simply printing a UTF-16 code unit like `std::wcout << wchar_t(0x2014);` won't work without prior setup. You need to either run `chcp 65001` in the console, which switches it to UTF-8 so that UTF-8 bytes can be written through `std::cout`, or call `_setmode(_fileno(stdout), _O_U16TEXT);` in code, which puts stdout into UTF-16 mode, before printing the character with

std::wcout << L"—";

This will not always work, because Unicode support in the Windows console is weaker. In many cases the characters don't appear due to issues in the renderer or font and are replaced with squares or `????`. In that case, copying the text out and pasting it into any Unicode-aware text box will usually display it properly.

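If you're using Windows in English or another Western European language and the active code page is Windows-1252, you can print the em dash, which sits at code point 151 (0x97) there, simply with the line below. (ISO-8859-1 itself has no em dash; 0x97 is a control character in that encoding.)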

cout << (char)151;

If it doesn't work, then you're not on code page 1252. You can change it to 1252 if possible, or look up the em dash in your code page (if it has one).

On Linux things are much simpler because UTF-8 is used by default. So you can output the string as normal without resorting to `std::wcout`:

std::cout << "—"; // the string literal must be encoded as UTF-8
// or use std::cout << u8"—" to force the encoding (before C++20; from C++20 on, u8 literals are char8_t)

In fact you'll often get surprising results if you use wide strings on Linux. `std::wcout << L"—"` often won't work, possibly because of bugs in libc, though a common cause is that the program is still running in the default "C" locale.


That said, the Windows 10 console now supports UTF-8 well and even allows UTF-8 to be used as the locale, so if you don't need to support Windows 7 there's a near-universal method to print any Unicode string:

std::cout << u8"—";
phuclv