10

I need to open a file as a std::fstream (or really any other std::ostream) when the file name is a "Unicode" file name.

Under MSVC I have the non-standard extension std::fstream::open(wchar_t const *, ...). What can I do with other compilers, such as GCC (most important) and possibly the Borland compiler?

I know that the CRTL provides _wfopen, but it gives a C FILE * interface instead of iostreams. Is there perhaps a non-standard way to create an iostream from a FILE *? Is there any boost::ifstream with an MSVC-like extension for Windows?

Smi
Artyom
  • You can't. On Mac OS X it was decided that the POSIX file APIs, and hence std::fstream, would all take UTF-8. In environments without platform-specific extensions (such as gcc and bc on Windows; they might have their own extensions, but those would be outside the POSIX standard), the C and C++ runtime functions cannot be expected to reliably access files with non-ASCII characters in their names. – Chris Becke Feb 23 '10 at 09:48
  • @Chris I have no problem with a library that supports UTF-8; it is perfect for me. The issue is that Windows does not support UTF-8. – Artyom Feb 23 '10 at 10:42
  • a hackish workaround for MinGW is in http://stackoverflow.com/questions/6524821/opening-stream-via-function – marcin Mar 02 '15 at 19:03

3 Answers

7

Unfortunately, there's no standard way to do that, although C++0x (1x?) promises to add one. Until then, you rightly assumed that a solution can be found in Boost; the library you're looking for, however, is Boost.Filesystem.

Boost.Filesystem internally uses wide strings by default for its universal path system, so there are no Unicode problems in this regard.

Smi
Kornel Kisielewicz
  • Do you have a reference for C++0X? I remember discussions but no conclusion and I found nothing in the latest draft. – AProgrammer Feb 23 '10 at 09:30
  • The problem is that `boost::filesystem` does not support `wpath` under MinGW/GCC, because the Boost configuration defines `BOOST_NO_STD_WSTRING` for some reason (even though wide strings work quite well under MinGW). – Artyom Feb 23 '10 at 11:04
  • I've searched a little more and found this: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-closed.html#454, apparently no solution before library TR2. – AProgrammer Feb 23 '10 at 19:53
  • I don't think adding `wchar_t` is a good solution. I think the MS CRTL should simply support UTF-8 strings instead. Windows is the only operating system that uses a wide API for its core parts. – Artyom Feb 24 '10 at 06:19
  • @art The issue is that Windows supported Unicode before UTF-8 was invented, and wide was the best choice at that time. The die is now cast. – David Heffernan Jun 19 '11 at 07:16
3

Currently there is no easy solution.

You need to create your own stream buffer that uses _wfopen under the hood. For this you can use, for example, boost::iostreams.

Artyom
  • *_wfopen* is windows specific, guess boost resolves according to platform?! https://stackoverflow.com/a/35065142/9437799 – Sam Ginrich Apr 20 '22 at 08:15
-2

Convert the Unicode filename to a char* string using something like wcstombs() or WideCharToMultiByte() (which gives you far more control over the codepages involved).

Then use the converted filename to open the file.
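As a rough illustration of the wcstombs() route, the sketch below converts a wide file name to the current C locale's multibyte encoding and reports failure when some character has no representation there. The helper name `narrow_for_locale` is hypothetical, and the result depends entirely on the locale in effect.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts `wide` to the current C locale's multibyte
// encoding. Returns false if any character cannot be represented in that
// locale, in which case `narrow` is left untouched.
bool narrow_for_locale(const std::wstring& wide, std::string& narrow) {
    // Worst case: every wide character expands to MB_CUR_MAX bytes.
    std::vector<char> buffer(wide.size() * MB_CUR_MAX + 1);
    const std::size_t written =
        std::wcstombs(buffer.data(), wide.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return false;  // unrepresentable character in this locale
    }
    narrow.assign(buffer.data(), written);
    return true;
}
```

The converted string can then be passed to std::fstream::open. When the function returns false, the file name cannot be expressed in the active code page, so this approach cannot open that file.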

Michael Burr
  • The file name may include Unicode characters that cannot be represented in the current locale's code page (e.g. Hebrew characters can't be represented in a Latin-1 8-bit locale); also, Windows does not support UTF-8 code pages. So no, this does not work. – Artyom Feb 23 '10 at 09:23
  • covers most of the practical use cases with special characters, where that MS-`MultiByte` actually has been ANSI – Sam Ginrich Jul 09 '22 at 09:48