I am in the same situation as in "fopen with unicode filename": I want to use a library that calls fopen internally. The library in question is minizip, and I need it to work with UTF-8 encoded file names on Windows, UNIX (OSX), iOS and Android. I read the answers there, including the discussion about GetShortPathName, and the conclusion is essentially to rewrite the library. Is there any way around this? I also read that fopen on UNIX systems can handle UTF-8 file names (unlike its Windows counterpart). Can anyone confirm this?

I would really hate to have to sprinkle #ifdef WINDOWS blocks all over minizip... Does anyone have an alternative?

David Menard

2 Answers

On Windows, use `_wfopen` in Unicode projects. Note that it accepts wide (UTF-16) strings, not UTF-8. For UTF-8, the standard `fopen` has an extra `ccs` option: `FILE *fp = fopen("newfile.txt", "rt+, ccs=encoding");`
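To illustrate the `_wfopen` route, here is a minimal sketch of a wrapper that takes a UTF-8 path on every platform. The name `fopen_utf8` is hypothetical (it is not part of the CRT or of Minizip); the sketch assumes the path is valid UTF-8 and converts it with `MultiByteToWideChar` on Windows. As far as Microsoft's documentation goes, the `ccs` flag selects the encoding of the file's contents, so it does not replace this conversion of the file name.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: opens a file whose path is given as UTF-8.
// Returns nullptr on failure, like fopen.
std::FILE* fopen_utf8(const std::string& utf8Path, const std::string& mode)
{
#ifdef _WIN32
    // Convert a UTF-8 string to UTF-16; yields an empty string on failure.
    auto widen = [](const std::string& s) -> std::wstring {
        if (s.empty()) return std::wstring();
        const int len = MultiByteToWideChar(CP_UTF8, 0, s.data(),
                                            static_cast<int>(s.size()), nullptr, 0);
        if (len <= 0) return std::wstring();
        std::wstring w(static_cast<size_t>(len), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, s.data(),
                            static_cast<int>(s.size()), &w[0], len);
        return w;
    };
    const std::wstring widePath = widen(utf8Path);
    const std::wstring wideMode = widen(mode);
    if (widePath.empty() || wideMode.empty()) return nullptr;
    return _wfopen(widePath.c_str(), wideMode.c_str());
#else
    // POSIX fopen hands the path bytes to the OS unchanged, so UTF-8 works.
    return std::fopen(utf8Path.c_str(), mode.c_str());
#endif
}
```

A wrapper like this only helps where you control the call site; a library that calls `fopen` itself, as the question describes for minizip, would still need its I/O functions replaced.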

i486
  • From what I understand, the encoding option for fopen applies to the content of the file, not the filename. Am I wrong? – David Menard Apr 06 '16 at 18:43
  • @DavidMenard Not sure about `ccs` - maybe you are right. `_wfopen` is for Unicode filenames. – i486 Apr 06 '16 at 18:44
The situation is somewhat better nowadays. You no longer need to modify the Minizip sources, although one #ifdef in your own code is still required to support Unicode file paths on Windows. Recent versions provide new API functions for reading and writing ZIP files:

  • unzOpen2_64(const void* path, zlib_filefunc64_def* pzlib_filefunc_def)
  • zipOpen2_64(const void* pathname, int append, zipcharpc* globalcomment, zlib_filefunc64_def* pzlib_filefunc_def)

The zlib_filefunc64_def parameter points to a structure holding a set of functions for working with the file system (or memory). In addition, the Minizip package ships the header 'iowin32.h', which provides functions that fill a zlib_filefunc64_def structure using the Windows API. So, to open a ZIP file for writing, you can use code like this:

#include <string>
#include "zip.h"
#ifdef _WIN32
    #include <windows.h>
    #include "iowin32.h"
#endif

// Opens (creates) a ZIP archive for writing; the path is expected in UTF-8.
zipFile openZipFile(const std::string& utf8FilePath)
{
    zipFile hZipFile = nullptr;
    #ifdef _WIN32
        // Use Minizip's wide-character Win32 I/O functions
        zlib_filefunc64_def ffunc;
        fill_win32_filefunc64W(&ffunc);

        // Convert the path from UTF-8 to UTF-16
        const int count = MultiByteToWideChar(CP_UTF8, 0, utf8FilePath.c_str(), static_cast<int>(utf8FilePath.size()), nullptr, 0);
        std::wstring unicodePath(count, 0);
        MultiByteToWideChar(CP_UTF8, 0, utf8FilePath.c_str(), static_cast<int>(utf8FilePath.size()), &unicodePath[0], count);

        // append = 0 (APPEND_STATUS_CREATE) creates a new archive
        hZipFile = zipOpen2_64(unicodePath.c_str(), 0, nullptr, &ffunc);
    #else
        // Unix systems handle UTF-8 paths by default
        hZipFile = zipOpen64(utf8FilePath.c_str(), 0);
    #endif

    return hZipFile;
}

The second option that worked for me is simply to set the locale to UTF-8. It is better to do this during application initialization, because setlocale is not thread safe:

std::setlocale(LC_ALL, "en_us.utf8");

It looks much simpler, but I think the first variant is better and more reliable, as it does not depend on locale settings.

Pavel K.