
I need a C++ program that creates a new file whose name contains Unicode characters (for example, äöüé.txt) on both Windows and Linux. I am using the following code:

#include <fstream>
#include <iostream>
#include <string>

int main(){
    std::string nameOfFile;

    std::cout << "Please enter the name of  file ! " << std::endl;

    std::cin >> nameOfFile;

    std::cout << "name = " << nameOfFile << std::endl;

    std::fstream mystream;

    mystream.open(nameOfFile, std::ios::out | std::ios::trunc | std::ios::binary);

    mystream.close();

    return 0;
}

I ran the same program on both Windows and Linux (built with Visual Studio 2015 on Windows and gcc 5.4 on Linux), typing "äöüé.txt" as the input in the terminal.

On Linux, the file is created with the correct name "äöüé.txt". On Windows, however, the name of the created file is garbled ("„”‚.txt").

I believe this is caused by the encoding difference between the two systems: Linux uses UTF-8, while Windows uses UTF-16.

I need the file to be created with the correct name on Windows, just as on Linux.

I have tried the following methods:

(1) Following std::wstring VS std::string, I tried Microsoft's MultiByteToWideChar() function, as described in detail in Open utf8 encoded filename in c++ Windows, but it failed:

#include <fstream>
#include <iostream>
#include <string>

#ifdef _MSC_VER
#include <windows.h>
std::wstring ToUtf16(std::string str)
{
    std::wstring ret;
    int len = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), NULL, 0);
    if (len > 0)
    {
        ret.resize(len);
        MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), &ret[0], len);
    }
    return ret;
}
#endif

int main()
{
    std::string nameOfFile;

    std::cout << "Please enter the name of  file ! " << std::endl;

    std::cin >> nameOfFile;

    std::cout << "name = " << nameOfFile << std::endl;

    std::ifstream iFileStream(
        #ifdef _MSC_VER
        ToUtf16(nameOfFile).c_str()
        #else
        nameOfFile.c_str()
        #endif
        , std::ifstream::in | std::ifstream::binary);
    return 0;
}

(2) Following How to create a file with UNICODE path on Windows with C++, I tried the CreateFile() function, but it failed:

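#include <windows.h>
#include <io.h>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>

int main()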
{
    std::string nameOfFile;

    std::cout << "Please enter the name of  file ! " << std::endl;

    std::cin >> nameOfFile;

    std::cout << "name = " << nameOfFile << std::endl;

    /*convert string to char array */
    int stringLen = nameOfFile.length();
    char* text = new char[stringLen + 1];
    std::strcpy(text, nameOfFile.c_str());

    /*Convert to utf-16*/
    HANDLE hFile = CreateFileA(nameOfFile.c_str(),
        GENERIC_WRITE,
        0,
        NULL,
        CREATE_NEW,
        FILE_ATTRIBUTE_NORMAL,
        NULL);

    if (hFile != INVALID_HANDLE_VALUE) {
        int file_descriptor = _open_osfhandle((intptr_t)hFile, 0);

        if (file_descriptor != -1) {
            FILE* file = _fdopen(file_descriptor, "w");

            if (file != NULL) {
                std::ofstream stream(file);

                stream << "Hello World\n";

                // Closes stream, file, file_descriptor, and file_handle.
                stream.close();

                file = NULL;
                file_descriptor = -1;
                hFile = INVALID_HANDLE_VALUE;
            }
        }
    }

    return 0;
}

(3) Following https://en.cppreference.com/w/cpp/locale/codecvt_utf8_utf16 (see the example at the bottom), I tried to convert the name with codecvt and then call _wfopen(), as described at https://learn.microsoft.com/en-us/previous-versions/yeby3zcb(v%3Dvs.140), but it failed too.
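The post does not show the code for attempt (3). As a minimal sketch of what such a conversion usually looks like in C++11, assuming the input really is UTF-8: Utf8ToUtf16 is a hypothetical helper name, not something from the linked pages. Note that std::wstring_convert and <codecvt> are deprecated from C++17 onwards.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Converts a UTF-8 encoded byte string to UTF-16 code units.
// Assumption: the input is valid UTF-8; from_bytes() reports
// malformed input by throwing std::range_error.
std::u16string Utf8ToUtf16(const std::string& utf8)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.from_bytes(utf8);
}
```

On Windows, where wchar_t is 16 bits wide, the same pattern with wchar_t in place of char16_t yields a std::wstring that can be passed to _wfopen(). The conversion only helps if the bytes read from the console are UTF-8 in the first place.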

My constraints are:

  1. C++11 only. (I know that C++17 adds the filesystem library to the standard library, which would solve this problem, as described in How to open an std::fstream (ofstream or ifstream) with a unicode filename?)

  2. Boost is not allowed.

  3. The Qt library is not allowed.

The only things I can use are the C++ standard library and Microsoft's libraries.

Do you have any ideas?

To Alan:

Thanks for your reply. I used the following code to check how the characters are encoded on my Windows machine:

#include <cstdio>
#include <cwchar>
#include <iostream>
#include <string>

int main(){

    std::wstring nameOfFile;

    std::wcout << "Please enter the name of  file ! " << std::endl;

    std::wcin >> nameOfFile;

    std::wcout << "name = " << nameOfFile << std::endl;

    /*convert string to char array */
    int stringLen = nameOfFile.length();
    wchar_t* text = new wchar_t[stringLen + 1];
    std::wcscpy(text, nameOfFile.c_str());

    /*Get the coding number*/
    std::cout << "strlen(text)    : " << wcslen(text) << std::endl;

    std::cout << "text(ordinals)  :";

    for (size_t i = 0, iMax = wcslen(text); i < iMax; ++i)
    {
        std::cout << " " << static_cast<unsigned int>(
            static_cast<unsigned char>(text[i])
        );
    }

    _wfopen(text, L"w");

    return 0;
}

The code page on my Windows machine is 850, and the output shows that äöüé is encoded as 132 148 129 130, which, according to the code page 850 table, corresponds exactly to ä (132), ö (148), ü (129), and é (130).

At the end of the code above, I call _wfopen() to create the file, but the file that is actually created still has a garbled name.
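The byte values reported above suggest the console input is encoded in code page 850 rather than UTF-8. Read as Windows-1252, the same bytes would be „ (132), ” (148), an unassigned position (129), and ‚ (130), which is consistent with the garbled name "„”‚.txt". The sketch below illustrates the missing step, mapping code page 850 bytes to Unicode before any UTF-16 conversion. All three function names are hypothetical, and the table deliberately covers only ASCII and the four characters from the question.

```cpp
#include <string>

// Hypothetical helper: maps one code page 850 byte to a Unicode code point.
// Assumption: only ASCII and the four characters from the question are
// covered; any other byte becomes U+FFFD (the replacement character).
unsigned int Cp850ToCodePoint(unsigned char byte)
{
    if (byte < 0x80) return byte;   // ASCII is identical in code page 850
    switch (byte) {
        case 129: return 0x00FC;    // ü
        case 130: return 0x00E9;    // é
        case 132: return 0x00E4;    // ä
        case 148: return 0x00F6;    // ö
        default:  return 0xFFFD;    // not in this partial table
    }
}

// Appends a code point below U+10000 to `out` as UTF-8
// (sufficient for every value the table above can produce).
void AppendUtf8(std::string& out, unsigned int cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a code page 850 byte string to UTF-8 using the partial table.
std::string Cp850ToUtf8(const std::string& input)
{
    std::string out;
    for (unsigned char byte : input) {
        AppendUtf8(out, Cp850ToCodePoint(byte));
    }
    return out;
}
```

The UTF-8 result could then be fed to the ToUtf16() function from attempt (1). On Windows itself, a hand-written table should not be necessary: calling MultiByteToWideChar() with the code page returned by GetConsoleCP() instead of CP_UTF8 would be expected to do the full mapping, although that has not been verified against the setup described here.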

By the way, the file stream used in my second example cannot create a new file; it can only read an existing one.

I think fopen() and _wfopen() are the only functions that can create a new file rather than just read an existing one.
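On the point about creating files, here is a minimal sketch of the difference between input and output streams, assuming standard C++11 behaviour. std::ifstream opens a file for reading only and fails if it does not exist, whereas std::ofstream (or std::fstream opened with std::ios::out) creates it. CreateEmptyFile and FileExists are hypothetical helper names used only for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: creates (or truncates) an empty file.
// Returns true if the stream could be opened for writing.
bool CreateEmptyFile(const std::string& path)
{
    std::ofstream out(path, std::ios::out | std::ios::trunc | std::ios::binary);
    return out.is_open();
}

// Hypothetical helper: reports whether the file can be opened for reading.
// An input stream never creates the file, so this is false for a missing file.
bool FileExists(const std::string& path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return in.is_open();
}
```

With Visual Studio's standard library, the stream constructors and open() also accept a wide-character path (const wchar_t* or std::wstring) as a non-standard extension, so a correctly converted UTF-16 name can be passed to std::ofstream directly instead of going through _wfopen().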

Remy Lebeau
Franck
  • you need to pass `std::wstring` to `open` on windows, you should probably also check that the string you read from the terminal is correct, unicode really doesn't work well in the windows terminal unless all the characters fit in the current code page – Alan Birtles Jan 09 '20 at 09:55
  • please also provide a minimal reproducible example of the code you have tried and didn't work – Alan Birtles Jan 09 '20 at 09:59
  • in your second example `utf8path` is presumably supposed to be `nameOfFile`, `nameOfFile` read from the console will almost certainly not be utf-8 encoded on windows. in your third example you are using `CreateFileA` not `CreateFileW` so will only be able to open ascii filenames – Alan Birtles Jan 09 '20 at 10:26
  • Thanks, Alan. Can you give me a simple example that shows how to create a file with a Unicode name? – Franck Jan 09 '20 at 13:25
  • There are plenty of examples in the posts you've already linked to, I think your issue is probably not creating the file but reading the filename from the console – Alan Birtles Jan 09 '20 at 13:33
  • Yes, as you said, there are plenty of examples in the posts. But what I need is to create a file (one that does not exist yet) using the name I type in the console. Most of the posts only explain how to read an existing file, which already works fine on my side. – Franck Jan 09 '20 at 13:50
  • The windows console really doesn't support unicode very well, you might have some luck using `std::wcin` if the entered characters are part of your current code page – Alan Birtles Jan 09 '20 at 14:09

0 Answers