23

I am writing code that runs in Windows and outputs a text file that later becomes the input to a program in Linux. This program behaves incorrectly when given files that have newlines that are CR+LF rather than just LF.

I know that I can use tools like dos2unix, but I'd like to skip the extra step. Is it possible to get a C++ program in Windows to use the Linux newline instead of the Windows one?

thornate

3 Answers

35

Yes, you have to open the file in "binary" mode to stop the newline translation.

How you do it depends on how you are opening the file.

Using fopen:

FILE* outfile = fopen( "filename", "wb" );

Using ofstream:

std::ofstream outfile( "filename", std::ios_base::binary | std::ios_base::out );
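
As a minimal sketch of the binary-mode approach, the helpers below write two lines through a `std::ofstream` opened with `std::ios_base::binary` and read the raw bytes back so the line endings can be inspected. The function names `write_lf_only` and `read_raw` are hypothetical, chosen only for this illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes two lines with newline translation disabled.
// In binary mode each '\n' is stored as a single LF byte, even on Windows.
void write_lf_only(const std::string& path) {
    std::ofstream out(path, std::ios_base::binary);
    out << "first line\n" << "second line\n";
}

// Hypothetical helper: returns the file's bytes exactly as stored on disk.
// Reading in binary mode keeps any CR bytes visible instead of hiding them.
std::string read_raw(const std::string& path) {
    std::ifstream in(path, std::ios_base::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

After calling `write_lf_only("unix_newlines.txt")`, the string returned by `read_raw("unix_newlines.txt")` should contain no `'\r'` byte, which can be checked with `find('\r') == std::string::npos`.
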
CB Bailey
  • 1
    Agreed. Open the stream as binary, and no translation takes place. The output of either '\n' or std::endl to such a stream results in a line feed only. – Clifford Oct 08 '09 at 08:39
  • 2
    The `std::ios_base::out` flag doesn't seem to be necessary since it's implied by the `o` in `ofstream`. From the [docs](http://www.cplusplus.com/reference/fstream/ofstream/ofstream): "`out` is always set for `ofstream` objects (even if explicitly not set in argument mode)". – Gumby The Green Jul 27 '19 at 22:27
-1

OK, so this is probably not what you want to hear, but here's my $0.02 based on my experience with this:

If you need to pass data between different platforms, in the long run you're probably better off using a format that doesn't care what line breaks look like. If it's text files, users will sometimes mess with them. If messing up the line endings causes your application to fail, it is going to be a support-intensive application.

Been there, done that, switched to XML. Made the support guys a lot happier.

sbi
-1

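An alternative that is sometimes suggested is to use an explicit escape sequence for the LF character (decimal 10): '\012' or '\x0A', which is intended to represent a single line feed regardless of platform. Note that this does not work on at least some compilers; for example, on MSVC 2019 16.11.6, both '\012' and '\x0A' are still written as carriage return plus line feed. It also does not matter there whether a string literal ("\012") or a char literal ('\012') is used. The reason is that on ASCII-based platforms '\n' already has the value 10, so the escapes name the same character, and the CR+LF expansion is performed by the text-mode stream when the character is written, not by the compiler.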

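This method is also meant to avoid length surprises, since a '\n' can expand to two bytes when it is written through a text-mode stream on Windows. Multibyte Unicode characters can cause similar surprises: in UTF-8, a single character written directly into a string literal in the source code occupies more than one byte.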

Note also that '\r' is the platform-independent code for a single carriage return (decimal 13). The '\f' character is not the line feed, but rather the form feed (decimal 12), which is not a newline on any platform I am aware of. C defines '\n' as the newline character rather than specifically as LF, hence the attempt to use the longer octal or hexadecimal escapes.
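
To make the caveat about the escapes concrete, here is a small sketch that writes a line ending in '\x0A' with a caller-chosen open mode and returns the bytes that end up on disk. The helper name `bytes_written` is hypothetical, and the `static_assert` assumes an ASCII-compatible execution character set.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Assumption: ASCII-compatible execution character set. There, '\n',
// '\x0A' and '\012' are one and the same character (decimal 10).
static_assert('\n' == '\x0A' && '\n' == '\012',
              "this sketch assumes an ASCII-compatible character set");

// Hypothetical helper: writes "line" followed by '\x0A' using the given
// open mode, then returns the file's raw bytes.
std::string bytes_written(const std::string& path,
                          std::ios_base::openmode mode) {
    {
        std::ofstream out(path, mode);
        out << "line" << '\x0A';
    }  // the stream is flushed and closed here, before reading back
    std::ifstream in(path, std::ios_base::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Consistent with the MSVC observation above, calling `bytes_written` with `std::ios_base::out` on Windows would be expected to return a string ending in CR+LF, while passing `std::ios_base::binary` should return one ending in LF only. On Linux both modes should give LF only, since no translation takes place there.
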

codeling
  • According to my experiments with VS 2019 (16.11.6), `\x0A` and `\012` both get translated to carriage return and line feed on Windows, at least with MSVC, if the ofstream is opened in the default "text mode". – codeling Nov 11 '21 at 11:39
  • I have added a corresponding caveat to your answer. Can you comment on which compilers this actually works with? – codeling Nov 11 '21 at 11:49