You wouldn't imagine that something as basic as opening a file using the C++ standard library in a Windows application would be tricky... but it appears to be. By Unicode here I mean UTF-8, but I can convert to UTF-16 or whatever; the point is getting an `ofstream` instance from a Unicode filename. Before I hack up my own solution, is there a preferred route here? Especially a cross-platform one?
-
I think this is a [duplicate](http://stackoverflow.com/questions/480849/windows-codepage-interactions-with-standard-c-c-filenames) question. See if any of the answers there can help. – Yorgos Pagles May 04 '09 at 20:37
-
Why don't you use data types like `std::wofstream`? Notice the **w**! – sergiol Nov 23 '16 at 18:53
7 Answers
The C++ standard library is not Unicode-aware. `char` and `wchar_t` are not required to be Unicode encodings.
On Windows, `wchar_t` is UTF-16, but there's no direct support for UTF-8 filenames in the standard library (the `char` datatype is not Unicode on Windows).
With MSVC (and thus the Microsoft STL), a constructor for filestreams is provided which takes a `const wchar_t*` filename, allowing you to create the stream as:
wchar_t const name[] = L"filename.txt";
std::fstream file(name);
However, this overload is not specified by the C++11 standard (it only guarantees the presence of the `char`-based version). It is also not present on alternative STL implementations like GCC's libstdc++ for MinGW(-w64), as of version g++ 4.8.x.
Note that just like `char` on Windows is not UTF-8, on other OSes `wchar_t` may not be UTF-16. So overall, this isn't likely to be portable. Opening a stream given a `wchar_t` filename isn't defined according to the standard, and specifying the filename in `char`s may be difficult because the encoding used by `char` varies between OSes.
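To make the trade-off concrete, here is a minimal sketch of a wrapper that hides the platform difference. The names `open_utf8` and `utf8_to_utf16` are hypothetical helpers, not part of any library. It assumes the Microsoft STL extension described above is available whenever `_MSC_VER` is defined, and that every other platform accepts UTF-8 bytes in `char` filenames (typical on Linux and macOS, but not guaranteed by the standard, and not true for non-MSVC toolchains on Windows).

```cpp
#include <cassert>
#include <fstream>
#include <string>

#ifdef _MSC_VER
#include <windows.h>

// Convert UTF-8 to UTF-16 with the Win32 API; returns an empty string on failure.
static std::wstring utf8_to_utf16(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                        utf8.data(), static_cast<int>(utf8.size()),
                                        nullptr, 0);
    if (len <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(len), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        utf8.data(), static_cast<int>(utf8.size()),
                        &wide[0], len);
    return wide;
}
#endif

// Hypothetical helper: open `file` for writing using a UTF-8 encoded name.
bool open_utf8(std::ofstream& file, const std::string& utf8_name) {
    assert(!utf8_name.empty());
#ifdef _MSC_VER
    // Microsoft extension: open() accepts a const wchar_t* file name.
    file.open(utf8_to_utf16(utf8_name).c_str(), std::ios::binary);
#else
    // Assumption: the platform treats char file names as UTF-8 bytes.
    file.open(utf8_name.c_str(), std::ios::binary);
#endif
    return file.is_open();
}
```

Usage would look like `std::ofstream f; if (open_utf8(f, name)) f << "hello";`. The `#else` branch is exactly the non-portable part the answer warns about, so treat it as a convenience for known platforms rather than a general solution.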

-
A far more complete and up-to-date answer, including how to do this with g++, as well as other Windows API avenues, etc., is available in [a more recent thread](http://stackoverflow.com/a/23969243/464581). – Cheers and hth. - Alf May 31 '14 at 14:09
-
@MichalM: no. `wchar_t` is of course just a 16-bit wide character type, which can be used to store anything you like. It doesn't care about encodings. But the Win32 APIs which accept `wchar_t` arguments expect them to contain UTF-16 data. The Windows API hasn't used UCS-2 since Windows 2000. – jalf Nov 13 '15 at 15:03
-
@MichalM: What it *is* (not what it's close to, but what is *actually* stored in a `wchar_t`) is a UTF-16 *code unit*. It's not UCS-2, and while it is close to UCS-2, it is closer still to a UTF-16 code unit (because that's what it actually *is*). UTF-16 specifies a code point to be represented by one or two code units, the latter being known as a surrogate pair. – jalf Nov 18 '15 at 10:43
-
Really? It's of course present in MinGW, since MinGW is a copy-paste of MSVC. – Алексей Неудачин Aug 02 '20 at 08:42
-
@jalf: you seem to know a lot about Unicode and C++. Do you offer a 2 or 3 day course where I could learn all about Unicode (ideally including Microsoft MBCS, Codepages and ANSI)? Or create a course on Pluralsight maybe (it doesn't have a unicode course)? – Thomas Weller Dec 01 '22 at 11:28
Since C++17, there is a cross-platform way to open a `std::fstream` with a Unicode filename using the `std::filesystem::path` overload. Example:
std::ofstream out(std::filesystem::path(u8"こんにちは"));
out << "hello";
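One caveat with the snippet above: in C++17 a `u8""` literal still has type `const char[]`, and `std::filesystem::path` interprets `char` input in the native narrow encoding, which on Windows is usually not UTF-8. In C++20 the literal becomes `char8_t`, which `path` always treats as UTF-8. The following is a sketch that tries to behave the same under both standards; `path_from_utf8` and `write_file_utf8` are hypothetical helper names, and the feature-test macro check is my assumption about how to pick the branch.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build a path from UTF-8 bytes stored in a std::string.
fs::path path_from_utf8(const std::string& utf8) {
#if defined(__cpp_lib_char8_t)
    // C++20: a char8_t source is always interpreted as UTF-8.
    return fs::path(std::u8string(utf8.begin(), utf8.end()));
#else
    // C++17: u8path interprets char input as UTF-8 (deprecated since C++20).
    return fs::u8path(utf8);
#endif
}

// Write `text` to a file whose name is given as UTF-8; returns true on success.
bool write_file_utf8(const std::string& utf8_name, const std::string& text) {
    assert(!utf8_name.empty());
    std::ofstream out(path_from_utf8(utf8_name), std::ios::binary);
    out << text;
    out.flush();
    return static_cast<bool>(out);
}
```

Spelling the name with byte escapes (for example `"\xE3\x81\x93"` for こ) or reading it from UTF-8 input avoids depending on how the compiler decodes the source file.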

-
When I tried this on Windows the file created was named "ã“ã‚“ã«ã¡ã¯". (Source file saved as UTF-8). Are there other steps you have to perform to make this sample create a correct filename? – thomthom Oct 11 '20 at 20:41
-
@thomthom When using C++17, it should be `std::filesystem::u8path(u8"whatever")`. – fkorsa Sep 21 '22 at 13:08
In current versions of Visual C++, `std::basic_fstream` has an `open()` method that takes a `wchar_t*`, according to http://msdn.microsoft.com/en-us/library/4dx08bh4.aspx.

-
Not all OSs and file systems support Unicode file names so it would not be portable. From what I can gather the `wchar_t*` `open()` and constructor on `fstream` are Microsoft extensions because NTFS does support Unicode file names. – John Downey May 04 '09 at 22:50
-
Or rather, because NTFS uses UTF-16 to encode Unicode filenames. Linux supports Unicode filenames too, but uses UTF-8, so the regular `char*` version works there. – jalf May 04 '09 at 23:12
Use `std::wofstream`, `std::wifstream` and `std::wfstream`. They accept Unicode filenames. The file name has to be a `wstring`, an array of `wchar_t`s, or a literal wrapped in the `_T()` macro or prefixed with `L`.

-
Could you provide evidence of `std::wfstream` being `Unicode`? Up to my modest knowledge, they just use `wchar_t` which is a wide character, usually `16-bits`. But the content could or not be `Unicode`. – Adrian Maire Apr 21 '17 at 11:53
-
What I meant is that they accept unicode strings, which answers the question, doesn't it? – Brackets Apr 21 '17 at 15:02
-
Actually it answers half of the question: let's say you got your file path as UTF-16 in your `wfstream` (or UTF-8 in your `fstream`). Windows does not accept Unicode and will return "wrong url" if you have some special characters (e.g. Chinese). – Adrian Maire Apr 21 '17 at 15:09
-
How can Windows not accept Unicode? Are you talking about the first versions of Windows? If anyone is still using them, they got bigger problems to solve. – Brackets Apr 21 '17 at 16:06
-
You were probably right though. I just stumbled upon 2 cases where I had to write Unicode characters with ofstreams and `wofstream` didn't help. I tried a simple `file << L"фыв" << endl;` and not only does it not write it, it stops any further writing through the `file` stream. So I used WinAPI's `WriteFile` instead. – Brackets Apr 25 '17 at 17:14
-
`wofstream` uses `wchar_t` as its unit of information. It doesn't provide a constructor taking any string type based on `wchar_t`. If your compiler offers it, it's a non-standard extension. If that compiler is the Microsoft Compiler, you could use `ofstream` just as well. The non-standard extension is available to any `basic_ofstream` class template instantiation. – IInspectable Jul 12 '19 at 20:52
Have a look at Boost.Nowide:
#include <boost/nowide/fstream.hpp>
#include <boost/nowide/cout.hpp>
using boost::nowide::ifstream;
using boost::nowide::cout;
// #include <fstream>
// #include <iostream>
// using std::ifstream;
// using std::cout;
#include <string>
int main() {
    ifstream f("UTF-8 (e.g. ß).txt");
    std::string line;
    std::getline(f, line);
    cout << "UTF-8 content: " << line;
}

-
nowide works very nicely.... shame it is not in the standard boost distribution; but getting it to work is pretty straightforward.... great to be able to sidestep wchar at last :) – Dec 15 '19 at 19:50
If you're using Qt mixed with `std::ifstream`:
return std::wstring(reinterpret_cast<const wchar_t*>(qString.utf16()));
Note that the `std::basic_ifstream` constructor normally doesn't accept a `const wchar_t*`, but in the MS implementation of the STL it does. With other implementations you would probably call `qString.utf8()`, and use the `const char*` ctor.

-
There is no `ofstream` [constructor](https://en.cppreference.com/w/cpp/io/basic_ofstream/basic_ofstream) that takes a `std::wstring` argument. This appears to be an answer to a different question. – IInspectable Nov 19 '20 at 21:08
-
This is still inaccurate. The additional `basic_ifstream` constructors are a Microsoft-specific extension to their C++ library implementation. Other compilers for Windows may or may not provide those. Regardless, as [this answer](https://stackoverflow.com/a/54842261/1889329) explains you don't need to worry about character encodings, or which constructor to use at all. Just pass a [`filesystem::path`](https://en.cppreference.com/w/cpp/filesystem/path) and have it work on any OS. – IInspectable Nov 24 '20 at 15:45
-
@IInspectable Updated again. Not everyone can use C++17. My answer was intended to help people who use Qt. – Andreas Haferburg Nov 27 '20 at 10:54
-
If you cannot use C++17 or Microsoft's C++ Standard Library implementation then there aren't any safe alternatives. Passing a UTF-8 encoded string for any given platform is either not safe, or documented to not be safe, or documented to be safe. Consider yourself lucky if you are in the latter two categories, though realistically you'll find yourself in the first. In that case the recommendation to use `utf8()` is almost malevolent in that it frequently doesn't fail. – IInspectable Nov 27 '20 at 12:27
-
@IInspectable I don't know what you want from me. Is there anything you think needs to be done? Why are you using a word like "malevolent"? That's not a constructive way of working towards a common goal. – Andreas Haferburg Nov 28 '20 at 09:49
Use `wfstream` instead of `fstream` and `wofstream` instead of `ofstream` and so on... You can find this information in the iosfwd header file.
