
The following program compiles in Visual Studio 2008 under Windows, both with Character Set "Use Unicode Character Set" and "Use Multi-Byte Character Set". However, it does not compile under Ubuntu 10.04.2 LTS 64-bit and GCC 4.4.3. I use Boost 1.46.1 under both environments.

#include <boost/filesystem/path.hpp>
#include <iostream>

int main() {
  boost::filesystem::path p(L"/test/test2");
  std::wcout << p.native() << std::endl;
  return 0;
}

The compile error under Linux is:

test.cpp:6: error: no match for ‘operator<<’ in ‘std::wcout << p.boost::filesystem3::path::native()’

It looks to me like boost::filesystem under Linux does not provide a wide character string in path::native(), despite boost::filesystem::path having been initialized with a wide string. Further, I'm guessing that this is because Linux defaults to UTF-8 and Windows to UTF-16.

So my first question is, how do I write a program that uses boost::filesystem and supports Unicode paths on both platforms?

Second question: When I run this program under Windows, it outputs:

/test/test2

My understanding is that the native() method should convert the path to the native format under Windows, which is using backslashes instead of forward slashes. Why is the string coming out in POSIX format?

Roger Dahl

3 Answers


Your understanding of native is not completely correct:

Native pathname format: An implementation defined format. [Note: For POSIX-like operating systems, the native format is the same as the generic format. For Windows, the native format is similar to the generic format, but the directory-separator characters can be either slashes or backslashes. --end note]

(from the Boost.Filesystem reference documentation)

This is because Windows allows POSIX-style pathnames, so using native() won't cause problems with the above.
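
If backslashes are wanted in the output, the separators have to be converted explicitly. As far as I know, Boost.Filesystem v3 offers `path::make_preferred()` for this on Windows. The sketch below only illustrates the idea with the standard library; `to_backslashes` is a hypothetical name, not a Boost function.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not part of Boost): returns a copy of `generic` with
// every forward slash replaced by a backslash. Works for both std::string
// and std::wstring because it is templated on the character type.
template <typename CharT>
std::basic_string<CharT> to_backslashes(std::basic_string<CharT> generic) {
  std::replace(generic.begin(), generic.end(), CharT('/'), CharT('\\'));
  return generic;
}
```

For example, `to_backslashes(std::wstring(L"/test/test2"))` yields `\test\test2`. A plain character replacement like this ignores platform rules such as drive letters, so it is only suitable for display.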

Because you are likely to run into similar problems elsewhere in your output code, I think the best way is to select the stream with the preprocessor, e.g.:

#ifdef _WIN32  // predefined by compilers targeting Windows
std::wostream& console = std::wcout;
#else          // assume a POSIX-like system
std::ostream& console = std::cout;
#endif

and something similar for the string-class.
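
As a possible alternative to macros, the stream can be picked from the character type itself. This is only a sketch: `console_for` and `print_line` are hypothetical helpers, not part of Boost or the standard library, and the assumption is that they would be used with `boost::filesystem::path::value_type` (`wchar_t` on Windows, `char` on POSIX).

```cpp
#include <iostream>
#include <ostream>
#include <string>

// Hypothetical helper (not a Boost or standard API): maps a character type
// to the matching standard output stream.
template <typename CharT>
struct console_for;

template <>
struct console_for<char> {
  static std::ostream& get() { return std::cout; }
};

template <>
struct console_for<wchar_t> {
  static std::wostream& get() { return std::wcout; }
};

// Writes a string to whichever console stream matches its character type
// and returns that stream.
template <typename CharT>
std::basic_ostream<CharT>& print_line(const std::basic_string<CharT>& s) {
  return console_for<CharT>::get() << s << '\n';
}
```

With this in place, `print_line(p.native())` should compile on both platforms, because `native()` returns a `std::basic_string` of the path's own character type. It only solves the compile error; whether non-ASCII characters display correctly still depends on the console.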

filmor
  • Thank you for your answer. Looking at this some more, it seems that using cout and no wide characters is the way to go in Linux since everything is UTF-8 there. But in the Windows console, using wcout would, though it compiles, not display Unicode characters correctly. To get that working, there are apparently several hoops one must jump through. – Roger Dahl Mar 17 '11 at 05:40
  • Windows console is capable of displaying Unicode in UTF-16 format but its default font is not capable, you must change the console settings to a Unicode font. Also in my experience all routes through Windows's C I/O functions will stupidly pass through a non-Unicode 8-bit encoding bottleneck even if you pass UTF-16 and output is in UTF-16! You may be able to do some setup and pass some parameters to avoid it... but calling the "wide" Windows console output APIs is much easier. – hippietrail Aug 13 '12 at 08:31

Try this:

#include <boost/filesystem/path.hpp>
#include <iostream>

int main() {
  boost::filesystem::path p("/test/test2");
  std::wcout << p.normalize() << std::endl;
  return 0;
}
Denys Yurchenko

If you want to use the wide output streams, you have to convert to a wide string:

#include <boost/filesystem/path.hpp>
#include <iostream>

int main() {
  boost::filesystem::path p(L"/test/test2");
  std::wcout << p.wstring() << std::endl;
  return 0;
}

Note that AFAIK using wcout doesn't give you Unicode output on Windows; you need to use wprintf instead.

Philipp
  • Thank you for the answer. I've done some research on wcout and it appears that it just converts wide characters to narrow characters by stripping out the high byte(s) before sending the resulting narrow characters to stdout. Apparently, this is how wcout is defined in the C++ Standard Library, but I haven't been able to verify this yet. If so, wcout really doesn't do anything useful on any platform. – Roger Dahl Mar 17 '11 at 05:15
  • path::string() is not functionally equivalent to path::native(). I wonder why there is no path::wnative()? – Roger Dahl Mar 17 '11 at 05:27
  • @Roger: "wcout really doesn't do anything useful on any platform." – I agree. To print Unicode strings in Windows, you have to use [`wprintf`](http://blogs.msdn.com/b/michkap/archive/2008/03/18/8306597.aspx). About `wnative`: By definition, there is exactly one canonical native format on each platform, which can be a wide (Windows) or narrow string (Unix). – Philipp Mar 17 '11 at 13:36
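
Following up on the point that narrow output is the natural choice where the terminal expects UTF-8: one option is to encode a wide string as UTF-8 explicitly and write the result to `std::cout`. The sketch below is a hand-written illustration, not something Boost.Filesystem provides. `to_utf8` is a hypothetical helper, and it assumes the input is UTF-16 where `wchar_t` is 16 bits and UTF-32 where it is 32 bits.

```cpp
#include <string>

// Hypothetical helper: encodes a wide string as UTF-8. Assumes UTF-16 input
// where wchar_t is 16 bits (Windows) and UTF-32 input where it is 32 bits
// (Linux). Unpaired surrogates are passed through without validation.
std::string to_utf8(const std::wstring& in) {
  std::string out;
  for (std::wstring::size_type i = 0; i < in.size(); ++i) {
    unsigned long cp = static_cast<unsigned long>(in[i]);
    // Combine a UTF-16 surrogate pair into a single code point.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
      unsigned long lo = static_cast<unsigned long>(in[i + 1]);
      if (lo >= 0xDC00 && lo <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        ++i;
      }
    }
    if (cp < 0x80) {  // 1 byte: plain ASCII
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {  // 2 bytes
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {  // 3 bytes
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {  // 4 bytes
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

A call such as `std::cout << to_utf8(p.wstring())` would then print the path as UTF-8 bytes, which a UTF-8 terminal should display correctly. On the Windows console this alone is not expected to be enough, for the reasons given in the comments above.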