#include <iostream>
#include <Windows.h>
#include <locale>
#include <string>
#include <codecvt>
typedef wchar_t* LPWSTR, *PWSTR;

template <typename Facet>
struct deletable_facet : Facet
{
    using Facet::Facet;
};

int main(int argc, char *argv[])
{
    std::cout << argv[0] << std::endl;

    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    //std::wcout << converter.from_bytes(argv[0]) << std::endl; // range error


    std::wstring_convert<deletable_facet<std::codecvt<wchar_t, char, std::mbstate_t>>> conv;
    std::wstring ns = conv.from_bytes(argv[0]);
    std::wcout << ns << std::endl;

    wchar_t filename[MAX_PATH];
    //GetModuleFileName(NULL,filename,MAX_PATH); // cant convert wstring_t* to char*
    GetModuleFileNameW(NULL,filename,MAX_PATH);
    std::wcout << filename << std::endl;


    getchar();
    return 0;
}

Output:

 C:\Users\luka\Desktop\ⁿ?icΣ\unicode.exe
 C:\Users\luka\Desktop\ⁿ?icΣ\unicode.exe
 C:\Users\luka\Desktop\ⁿ

The actual name of the folder is üлicä.

I've been trying many different approaches for about 2 hours now, and as far as I've seen people suggest GetModuleFileName, but as you can see that gives a conversion error (typedef wchar_t* LPWSTR, *PWSTR; isn't fixing it).

So is there any way to get the current folder path in Unicode, and to get the rest of the input arguments in Unicode (non-Latin characters)?

Luka Kostic
  • Are you writing to a file or to a terminal? – P.W Jan 12 '19 at 10:26
  • cmd, but when I cd to the folder it's displayed correctly, so that shouldn't be a problem unless it only shows C++ Unicode incorrectly – Luka Kostic Jan 12 '19 at 10:46
  • though when I do std::wcout< – Luka Kostic Jan 12 '19 at 10:47
  • Windows doesn't support UTF-8 in `argv`, so taking an argv element and pretending it's a UTF-8 string simply won't work. If you are programming for Windows, your best bet is probably to use `wmain` and `wchar_t** argv`. – n. m. could be an AI Jan 12 '19 at 11:01
  • wmain results in undefined reference to WinMain – Luka Kostic Jan 12 '19 at 11:02
  • C:/MinGW/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/lib/../lib\libmingw32.a(lib64_libmingw32_a-crt0_c.o):crt0_c.c:(.text.startup+0x2e): undefined reference to `WinMain' collect2.exe: error: ld returned 1 exit status – Luka Kostic Jan 12 '19 at 11:03
  • If you are using a Microsoft compiler, use `wmain` according to Microsoft documentation. It is not as simple as replacing `main`; you also need to change your project settings. I have no idea how MinGW copes with this problem, you probably want to consult the MinGW documentation. Google for "mingw unicode" or something. You might find something like [this](https://sourceforge.net/p/mingw-w64/wiki2/Unicode%20apps/), or perhaps something else. – n. m. could be an AI Jan 12 '19 at 11:06 (a minimal sketch of the wmain route follows these comments)
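
A minimal sketch of the wmain route from the comments above, assuming the MinGW-w64 toolchain shown in the error message; the -municode link flag and the file name unicode.cpp are assumptions, not details taken from the question:

// Assumed build line: g++ unicode.cpp -municode -o unicode.exe
// -municode makes the MinGW-w64 startup code call wmain instead of main,
// which avoids the "undefined reference to `WinMain'" linker error above.
#include <Windows.h>

int wmain(int argc, wchar_t *argv[])
{
    // argv is already UTF-16 here, so no conversion from the ANSI code page is needed.
    // Getting it onto the console correctly is a separate issue, covered in the answers below.
    wchar_t filename[MAX_PATH];
    GetModuleFileNameW(NULL, filename, MAX_PATH);

    DWORD count;
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), filename, lstrlenW(filename), &count, 0);
    return 0;
}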

4 Answers


Your usage of GetModuleFileNameW is correct. You should see the expected result with MessageBoxW(0, filename, 0, 0);

The problem is in printing L"üлicä" on the Windows console.

Try printing "üлicä" on the console:

#include <Windows.h>
#include <string>   // std::wstring
#include <wchar.h>  // wcslen

int main(int argc, char *argv[])
{
    DWORD count;
    std::wstring str = GetCommandLineW() + (std::wstring)L"\n";
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), str.c_str(), str.size(), &count, 0);
    MessageBoxW(0, str.c_str(), 0, 0);

    wchar_t filename[MAX_PATH];
    GetModuleFileNameW(0, filename, MAX_PATH);
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), filename, wcslen(filename), &count, 0);
    return 0;
}

In Visual Studio you can also use _setmode to enable usage of std::wcout/std::wcin.
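
For example, a minimal sketch of that _setmode approach (see also the wmain answer further down):

#include <fcntl.h>   // _O_U16TEXT
#include <io.h>      // _setmode, _fileno
#include <cstdio>    // stdout
#include <iostream>

int main()
{
    // Put the stdout stream into UTF-16 mode so std::wcout reaches the console as Unicode.
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::wcout << L"üлicä" << std::endl;
    return 0;
}

Note that once the stream is in _O_U16TEXT mode you should not mix narrow output (printf/std::cout) on the same stream.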

You also have the optional entry point wmain(int argc, wchar_t *argv[]), which provides argv in UTF-16 encoding.

The main entry point provides argv in ANSI encoding (not UTF-8 encoding). ANSI can lose information, unlike Unicode.
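
If switching to wmain is not an option, another route (a sketch, not part of the original answer) is to ignore the ANSI argv entirely and re-parse the process's UTF-16 command line with CommandLineToArgvW (shellapi.h, linked against shell32):

#include <Windows.h>
#include <shellapi.h>   // CommandLineToArgvW

int main()
{
    int argcW = 0;
    // Re-parse the original UTF-16 command line instead of using the lossy ANSI argv.
    LPWSTR *argvW = CommandLineToArgvW(GetCommandLineW(), &argcW);
    if (argvW)
    {
        DWORD count;
        for (int i = 0; i < argcW; ++i)
        {
            WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), argvW[i], lstrlenW(argvW[i]), &count, 0);
            WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), L"\n", 1, &count, 0);
        }
        LocalFree(argvW);   // the argument array is allocated with LocalAlloc
    }
    return 0;
}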

Barmak Shemirani

This is probably related not to the program but to the console. I suggest you try outputting to a file and check whether the encoding is correct.

You can do that using freopen:

int main(int argc, char *argv[]){ freopen("output-file-name.txt", "w", stdout); /*rest of code*/ }

If the problem persists, try using Visual Studio along with _setmode(..., _O_U16TEXT) just before using wcout, as described here: https://stackoverflow.com/a/9051543/9541897
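
A sketch of that file check (the file name check.txt and the WideCharToMultiByte round-trip are illustrative additions, not part of the original suggestion): write the UTF-16 path out as UTF-8 bytes and open the file in an editor that understands UTF-8.

#include <Windows.h>
#include <fstream>
#include <string>

// Convert a UTF-16 string to UTF-8 with the Win32 API.
static std::string to_utf8(const std::wstring &w)
{
    if (w.empty())
        return std::string();
    int len = WideCharToMultiByte(CP_UTF8, 0, w.c_str(), (int)w.size(), nullptr, 0, nullptr, nullptr);
    std::string out(len, '\0');
    WideCharToMultiByte(CP_UTF8, 0, w.c_str(), (int)w.size(), &out[0], len, nullptr, nullptr);
    return out;
}

int main()
{
    wchar_t filename[MAX_PATH];
    GetModuleFileNameW(NULL, filename, MAX_PATH);

    // If check.txt contains the correct path, the data itself is fine and only
    // the console output is at fault.
    std::ofstream("check.txt", std::ios::binary) << to_utf8(filename);
    return 0;
}

If the path looks right in the file, the problem is confined to how the console (or wcout) renders it.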

  • How do I use freopen("output-file-name.txt", "w", stdout);? Is it supposed to print all the previous couts, or do I pass it a wstring somewhere? And do I need to close it? – Luka Kostic Jan 12 '19 at 11:00
  • Just add it right below `int main(){`. `int main(){ freopen("out.txt", "w", stdout); /*rest of code */ }` – Moustafa El-Sayed Jan 12 '19 at 11:03
  • File is empty. Here is the code: `` int main(int argc, char* argv[]) { std::wcout << L"English -- Ελληνικά -- Español." << std::endl; freopen("output-file-name.txt", "w", stdout); std::wcout< – Luka Kostic Jan 12 '19 at 11:05
  • I know this sounds weird but when I commented out the first `wcout` line and removed the `L` it just worked perfectly!! Try using this code: `int main(int argc, char* argv[]) { freopen("output-file-name.txt", "w", stdout); std::wcout<<"üлicä areüлicä"< – Moustafa El-Sayed Jan 12 '19 at 11:14
  • That actually saves nicely so maybe there is hope – Luka Kostic Jan 12 '19 at 11:19
  • wchar_t filename[MAX_PATH]; GetModuleFileNameW(NULL,filename,MAX_PATH); std::wcout << filename << std::endl; still saves just the first one though – Luka Kostic Jan 12 '19 at 11:21
  • This leaves us with two possible scenarios: either `filename` already contains an invalid format, or it does have the correct format but `wcout` treats it the wrong way. Please check using Visual Studio and `_setmode(..., _O_U16TEXT)` right before using `wcout`; this would verify whether `wcout` is faulty – Moustafa El-Sayed Jan 12 '19 at 11:34

Here's an example that works with Windows. You'll have to find the right compiler/linker settings to support wmain on MinGW, but it will work. _setmode enables writing Unicode directly to the terminal, and should work as long as the font supports the characters. In my example I use some Chinese, which my font supports:

#include <Windows.h>
#include <iostream>
#include "fcntl.h"
#include "io.h"

int wmain(int argc, wchar_t* argv[])
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::wcout << argv[0] << std::endl;

    wchar_t filename[MAX_PATH];
    GetModuleFileNameW(NULL,filename,MAX_PATH);
    std::wcout << filename << std::endl;

    return 0;
}

Output:

马克.exe
C:\üлicä\马克.exe
Mark Tolonen

Why are you typedefing LPWSTR and PWSTR manually? windows.h already handles that for you.

In any case, as @n.m. said in the comments, the arguments for main() are NOT encoded in UTF-8 on Windows, so converting non-ASCII characters using a UTF-8-to-UTF-16 converter will not produce the correct output. Use the Win32 MultiByteToWideChar() function instead to convert the arguments, using CP_ACP as the codepage to convert from. Or, use wmain() instead, which provides the arguments as wchar_t* instead of as char*.

That will get you the data you want. Then, you just need to deal with the issue of Unicode output to the console. As other answers point out, the Windows console does not support UTF-16 output via std::wcout by default, so you have to jump through some additional hoops to make it work correctly (there are many other questions on StackOverflow about that issue).
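
A short sketch of that MultiByteToWideChar conversion (the helper name from_acp is made up for illustration):

#include <Windows.h>
#include <string>

// Convert an ANSI (CP_ACP) string, such as a main() argument, to UTF-16.
static std::wstring from_acp(const std::string &s)
{
    if (s.empty())
        return std::wstring();
    int len = MultiByteToWideChar(CP_ACP, 0, s.c_str(), (int)s.size(), nullptr, 0);
    std::wstring out(len, L'\0');
    MultiByteToWideChar(CP_ACP, 0, s.c_str(), (int)s.size(), &out[0], len);
    return out;
}

int main(int argc, char *argv[])
{
    std::wstring arg0 = from_acp(argv[0]);
    // Displaying arg0 on the console still needs the techniques from the other answers;
    // a message box shows it without any console involvement.
    MessageBoxW(0, arg0.c_str(), 0, 0);
    return 0;
}

Keep in mind that any character the ANSI code page cannot represent has already been lost by the time main() receives argv, so wmain() remains the more robust option.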

Remy Lebeau