I need to use UTF-8-encoded strings with log4cxx. I can print the strings just fine with std::cout (the characters are displayed correctly). With log4cxx, i.e. passing the strings to the LOG4CXX_DEBUG() macro with a ConsoleAppender, the output shows "??" in place of each special character. I found one suggested solution:

LOG4CXX_DECODE_CHAR(logstring, str);
LOG4CXX_DEBUG(logger, logstring);

where str is my input string and logger is my logger, but this does not work. Does anyone have an idea how to make this work? I googled around a bit, but I couldn't find anything useful.

arne

4 Answers

You can use

setlocale(LC_CTYPE, "UTF-8");

to set only the character encoding, without changing any other information about the locale.
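Since setlocale fails silently unless its return value is checked, a small wrapper can make the outcome visible. The following is only a sketch using the standard library; the helper names set_ctype_only and current_locale are made up for illustration and are not part of log4cxx.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: sets only LC_CTYPE to `name`.
// std::setlocale returns a null pointer when the C library rejects the name,
// in which case the current locale is left unchanged.
bool set_ctype_only(const char* name) {
    return std::setlocale(LC_CTYPE, name) != nullptr;
}

// Hypothetical helper: returns the current setting of one locale category
// without modifying it (a null second argument means "query only").
std::string current_locale(int category) {
    const char* name = std::setlocale(category, nullptr);
    return name ? name : "";
}
```

Calling set_ctype_only with the locale name before the first log statement and reporting a false result would show whether the name is accepted on the system at hand. Afterwards, current_locale(LC_NUMERIC) should still report the previous setting, because only the LC_CTYPE category was touched.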

Brian Campbell

I ran into the same problem and searched for a long time. I found this post; the answers here may work, but I don't like the setlocale-based solution, so I did more research and finally found another one.

I reconfigured and rebuilt log4cxx, and the problem was solved!

Add two more configure options when building log4cxx:

./configure --prefix=blabla --with-apr=blabla --with-apr-util=blabla --with-charset=utf-8 --with-logchar=utf-8

I hope this helps anyone who needs it.

Roman

One solution is to use

setlocale(LC_ALL, "en_US.UTF-8");

in my main function. This is OK for me, but if you want a more localizable application, hard-coding the locale like this will probably become hard to track and maintain.

arne

The first answer didn't work for me, and the second one is more than I want. So I combined the two answers:

setlocale(LC_CTYPE, "xx_XX.UTF-8");  // or "xx_XX.utf8", it means the same

where xx_XX is some language tag. I tested this on Linux by logging strings in many languages with different alphabets (including Chinese, and both left-to-right and right-to-left languages). For example, I tried:

setlocale(LC_CTYPE, "it_IT.UTF-8");

and it worked with every language I tested. I cannot understand why plain "UTF-8", without a language tag xx_XX, doesn't work, since I use UTF-8 precisely to be language-independent and shouldn't have to name a language. (If somebody knows the reason, it would be an interesting improvement to this answer.) Maybe this also depends on the operating system.
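Locale names are defined by the platform rather than by the C or C++ standard, so one likely reason is that a name accepted on one system (a bare "UTF-8", for instance) is simply not installed under that name on another. A way to cope with that is to try several names in order and keep the first one the C library accepts. This is a minimal sketch with the standard library only; the function name first_accepted_ctype is hypothetical.

```cpp
#include <clocale>
#include <string>
#include <vector>

// Hypothetical helper: tries each candidate as the LC_CTYPE locale, in order,
// and returns the first name the C library accepts. Returns an empty string
// if every candidate is rejected (std::setlocale returns a null pointer and
// leaves the locale unchanged in that case).
std::string first_accepted_ctype(const std::vector<std::string>& candidates) {
    for (const std::string& name : candidates) {
        if (std::setlocale(LC_CTYPE, name.c_str()) != nullptr) {
            return name;
        }
    }
    return "";
}
```

A call such as first_accepted_ctype({"UTF-8", "C.UTF-8", "it_IT.UTF-8"}) would then pick whichever of these names exists on the machine; the candidate list here is an assumption and should be adjusted to the locales actually installed.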

Finally, on Linux you can list the installed UTF-8 locales by typing in a shell:

$ locale -a | grep -i utf

fresko