21

I would like to convert a char* string to a wchar_t* string in C.

I have found many answers, but most of them are for C++. Could you help me?

Thanks.

templatetypedef
Crupuk
  • What is the original encoding in your `char*`? UTF-8? ANSI? What is `sizeof(wchar_t)` on your system and what encoding does it rely upon? UCS-2 (16-bit)? UCS-4 (32-bit)? – Benoit Jan 28 '11 at 08:23
  • @Benoit: Whoa... I thought `sizeof(wchar_t)` was always 2, no? – user541686 Jan 28 '11 at 08:24
  • @Mehrdad: It is not necessarily 2. It is implementation-defined. If programming on Windows, it has a size of two bytes and holds UTF-16, with double wchar_t's for surrogate pairs. – Benoit Jan 28 '11 at 08:25
  • @Benoit: o__O I did *not* know it's implementation-defined... interesting, thanks for the info. – user541686 Jan 28 '11 at 08:26
  • It's on a Unix system, so I guess it doesn't matter, no? – Crupuk Jan 28 '11 at 10:22
  • I forget which system (Linux maybe?), that uses a 4-byte `wchar_t` encoded with UTF-32. – Remy Lebeau Jan 28 '11 at 22:44
  • Yes, 4 bytes: `printf("Size of wchar_t : %zu", sizeof(wchar_t));` prints "Size of wchar_t : 4". So, how can I convert a string into Unicode? – Crupuk Jan 30 '11 at 19:06

5 Answers

30

Try swprintf with the %hs conversion specifier.

Example:

wchar_t  ws[100];
swprintf(ws, 100, L"%hs", "ansi string");
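
A slightly fuller sketch of the same idea, with error checking. ISO C specifies that a plain `%s` in the wide printf family takes a narrow multibyte `char*`, whereas Microsoft's runtime has traditionally read `%s` there as a wide string, which is why `%hs` is used on that platform. The helper name `widen_with_swprintf` and the `NARROW_FMT` macro are made up for illustration, and the `_WIN32` switch is an assumption about how one might pick the specifier.

```c
#include <stddef.h>
#include <wchar.h>

/* Assumption: use %hs on Microsoft runtimes, the ISO C %s elsewhere. */
#ifdef _WIN32
#define NARROW_FMT L"%hs"
#else
#define NARROW_FMT L"%s"
#endif

/* Hypothetical helper: converts the narrow string src into dst, which
   has room for dst_len wide characters including the terminator.
   Returns the number of wide characters written (terminator excluded),
   or -1 if the buffer is too small or the input cannot be converted. */
int widen_with_swprintf(wchar_t *dst, size_t dst_len, const char *src)
{
    int n = swprintf(dst, dst_len, NARROW_FMT, src);
    return n < 0 ? -1 : n;
}
```

With the buffer from the example above, the call would be `widen_with_swprintf(ws, 100, "ansi string")`. The conversion follows the current locale, so non-ASCII input needs a matching `setlocale()` call beforehand.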
Nick Dandoulakis
  • I will try this evening; for now I don't have access to a shell. Thanks. – Crupuk Jan 28 '11 at 10:30
  • @NickDandoulakis I think this answer could be very useful, however I found out that swprintf could have 2 possible interfaces, could you please take a look at this question? http://stackoverflow.com/q/17716763/2436175 – Antonio Jul 18 '13 at 09:45
  • @Antonio the interface that requires the buffer length is the portable one. – Nick Dandoulakis Jul 18 '13 at 12:14
  • @NickDandoulakis It won't compile on MinGW 4.5.2, for example, so unfortunately it is not general! – Antonio Jul 18 '13 at 17:21
  • this is a good solution when cross compiling to mingw on a linux platform – Octopus Aug 16 '13 at 19:33
  • Note that this appears to work with any of the printf functions, fortunately. – rookie1024 Jun 29 '15 at 22:13
  • Can you explain why %hs is the correct flag? I tried %ls but it doesn't work (at least on Windows). According to [this article](https://devblogs.microsoft.com/oldnewthing/20190830-00/?p=102823), %hs is used for a "narrow" string while %ls is used for a "wide" string. I'm using wchar_t, so I thought that the correct choice was %ls for a "wide" string, but I was wrong. – Bemipefe Apr 30 '20 at 18:43
  • @Bemipefe, the `%hs` specifies the type of the argument, e.g. "ansi string" which is a narrow string. – Nick Dandoulakis Apr 30 '20 at 19:38
  • @NickDandoulakis Thanks. I feel a little bit stupid. You are right. The whole printed string will be a wide string but you have to specify the original (input) format. – Bemipefe May 01 '20 at 16:11
  • @Bemipefe, no problem. [No such thing as a stupid question](https://en.wikipedia.org/wiki/No_such_thing_as_a_stupid_question) – Nick Dandoulakis May 01 '20 at 16:32
5

setlocale() followed by mbstowcs().
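
A minimal sketch of that sequence, assuming the input is encoded in whatever multibyte encoding the environment's locale selects (for example UTF-8 under a UTF-8 locale). The function names `select_env_locale` and `narrow_to_wide` are hypothetical, not part of any library.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: selects the character encoding named by the
   environment (LANG / LC_*). Returns 1 on success, 0 on failure.
   Normally called once at program start. */
int select_env_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Hypothetical helper: converts src into dst, which has room for
   dst_len wide characters including the terminator. Returns the
   number of wide characters stored (terminator excluded), or
   (size_t)-1 if src is not valid in the current locale's encoding. */
size_t narrow_to_wide(wchar_t *dst, size_t dst_len, const char *src)
{
    if (dst_len == 0)
        return (size_t)-1;

    size_t n = mbstowcs(dst, src, dst_len);
    if (n == (size_t)-1)
        return (size_t)-1;          /* invalid multibyte sequence */

    if (n == dst_len) {             /* output filled, no terminator written */
        dst[dst_len - 1] = L'\0';   /* truncate so the result is terminated */
        n = dst_len - 1;
    }
    return n;
}
```

A caller would run `select_env_locale()` once and then `narrow_to_wide(buf, 100, input)`. Without the locale call the program stays in the "C" locale, where only ASCII input is portable.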

caf
user541686
  • This is OK as long as the input is an ANSI string. – Benoit Jan 28 '11 at 08:24
  • @Benoit: Yeah, there's obviously more to string conversion than calling just a single function. But I didn't give any details since I think this is all the OP's looking for... – user541686 Jan 28 '11 at 08:26
  • The input comes from LdapDirectory, so I guess it's UTF-8? – Crupuk Jan 28 '11 at 10:29
  • @Benoit: There's no such thing as an "ANSI string". This will work if the original string is in the multibyte format corresponding to the currently set locale. – caf Jan 28 '11 at 11:53
  • I have already found this function, but I can't use it correctly. I just want to encode a string as Unicode to send in a mail subject header. Thanks. – Crupuk Jan 30 '11 at 19:07
  • @Crupuk: What format is the source string in? If it's just ASCII, and you want to use it in a UTF-8 mail header, then no transformation is needed. – caf Jan 31 '11 at 01:08
4

What you're looking for is mbstowcs.

It works like a string copy from char* to char*, except that the destination is a wchar_t* and each multibyte character in the source is converted according to the current locale.
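
When the length of the input is not known in advance, the destination can be sized first. The sketch below assumes the POSIX behavior that `mbstowcs` with a null destination only counts the wide characters needed; `wide_dup` is a hypothetical name, loosely modeled on `strdup`.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: returns a newly allocated wide-character copy
   of src, converted according to the current locale, or NULL on
   failure. The caller releases the result with free(). */
wchar_t *wide_dup(const char *src)
{
    /* Assumption (POSIX): a null destination makes mbstowcs return
       the required length without storing anything. */
    size_t n = mbstowcs(NULL, src, 0);
    if (n == (size_t)-1)
        return NULL;                /* invalid multibyte sequence */

    wchar_t *dst = malloc((n + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;                /* out of memory */

    /* n + 1 leaves room for the terminating null wide character. */
    if (mbstowcs(dst, src, n + 1) == (size_t)-1) {
        free(dst);
        return NULL;
    }
    return dst;
}
```

Usage would look like `wchar_t *w = wide_dup(input);` followed by a null check and, eventually, `free(w)`.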

Franky Rivera
0

If you happen to have the Windows API available, the conversion function MultiByteToWideChar offers configurable string conversion from different encodings to UTF-16. That might be more appropriate if you don't care too much about portability and don't want to figure out exactly what the implications of different locale settings are for the string conversion.

Christoffer
-4

If you currently have ANSI chars, just insert a 0 ('\0') before each char and cast them to wchar_t*.
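
Interleaving zero bytes only produces valid wide characters when wchar_t is 2 bytes and the byte order happens to match. With a 4-byte wchar_t, as reported in the question's comments, it does not. A sketch of a different technique for pure 7-bit ASCII input is to widen each character by assignment, which is independent of wchar_t's size and byte order. The name `ascii_to_wide` is hypothetical, and the code assumes the common case where ASCII characters have the same value as wchar_t.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: copies a 7-bit ASCII string into dst, which
   has room for dst_len wide characters including the terminator.
   Returns the number of characters copied (terminator excluded), or
   (size_t)-1 if src contains a non-ASCII byte or dst is too small. */
size_t ascii_to_wide(wchar_t *dst, size_t dst_len, const char *src)
{
    size_t i;

    for (i = 0; src[i] != '\0'; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c > 0x7F || i + 1 >= dst_len)
            return (size_t)-1;      /* not ASCII, or no room left */
        dst[i] = (wchar_t)c;        /* widen by value, not by byte layout */
    }
    if (dst_len == 0)
        return (size_t)-1;          /* no room even for the terminator */
    dst[i] = L'\0';
    return i;
}
```

Anything beyond ASCII (UTF-8 from a directory server, for instance) is rejected here rather than mangled; such input needs a real conversion such as mbstowcs under a suitable locale.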

mehrdad safa