
I'm working on an HTML form processor in C++, mainly as a learning experience. I have a little output buffer class to allow me to send the Content-Length header. It works fine until I try reading in and outputting a template file. It's on a Windows system, so the lines are of course terminated with \r\n, but when I use the length() method on my buffer string, it's not counting both characters, and my Content-Length ends up short. I tried reading the file both with and without ios::binary, and it makes no difference.

[EDIT]

OK, sorry, here is minimal code which reproduces the problem:

#include <iostream>
#include <fstream>
#include <sys/stat.h>

using namespace std;

size_t fileSize(const char* filename) {
    struct stat st;
    if(stat(filename, &st) != 0) return 0;
    return st.st_size;
}

int main() {
    char   fName[] = "testack.html";
    char   oName[] = "testout.txt";
    int   _size;
    char *_content;

    ifstream inFile;
    inFile.open(fName, ios::binary);
    if (inFile.good()) {
        _size = fileSize(fName);
        _content = new char[_size + 1];

        inFile.read(_content, _size);
        _content[_size] = 0;
    }

    ofstream os(oName);
    os << _content;

    return 0;
}

And here is the test file:

<HTML><BODY>Hello World!</BODY></HTML>

That is 38 bytes; Windows, my program, and everyone else agree, and I end up with 38 bytes in testout.txt. Now, if I add a single line break:

<HTML>
<BODY>Hello World!</BODY></HTML>

Windows says it's 40 bytes (as I would expect), my program reads 40 bytes, and I end up with 41 bytes in the output file. With a second line break:

<HTML>
<BODY>
Hello World!</BODY></HTML>

Windows says 42 bytes, my program reads 42, and I end up with 44 in the output file. So, it appears that an extra byte is being added to each line break when I output it, whether to a file or to stdout. At this point I'm completely confused. Any ideas?

[EDIT]

And, with a little more testing I discovered that an extra \r is being added to each line, thus I have, for example:

<HTML>\r\r\n
  • Using `ios::binary` *should* make a difference. – Bo Persson Mar 13 '17 at 13:36
  • Can you show how you read the file in binary mode? – NathanOliver Mar 13 '17 at 13:39
  • Present your [MCVE]. – Lightness Races in Orbit Mar 13 '17 at 13:43
  • @NathanOliver `inFile.open(fullname, ios::binary);` @BoundaryImposition You're right, sorry. Give me a minute. – alanlittle Mar 13 '17 at 13:46
  • That is how you open the file. How do you **read** the file? – NathanOliver Mar 13 '17 at 13:47
  • You are probably reading up-to but not including the end of line \r\n. Hard to know without minimal testable code – roalz Mar 13 '17 at 13:48
  • @BoundaryImposition OK, I did more testing, and I think I have the problem pretty well bracketed, but still no solution. Help! – alanlittle Mar 13 '17 at 16:52
  • You didn't open your output file in binary mode. – Lightness Races in Orbit Mar 13 '17 at 16:57
  • @BoundaryImposition Well, in practice, this will be going to `stdout`. I just have been outputting it to a text file to make it a little easier to test. Can I open `stdout` in binary mode? – alanlittle Mar 13 '17 at 17:01
  • "[What is the simplest way to write to stdout in binary mode?](http://stackoverflow.com/q/16888339/560648)" – Lightness Races in Orbit Mar 13 '17 at 17:02
  • @BoundaryImposition OK, I read that and some of the links there, and couldn't get any of those suggestions to work, but after a while it seemed to me that I was attacking the problem from the wrong end, and settled for: `#ifdef _WIN32` `buffer = regex_replace(buffer, regex("\r\n"), "\n");` `#endif` I don't know if this is a worthwhile question; it seems to me others might run into something similar. If so, I hope someone will post an answer so it doesn't get deleted by the Community bot. Thanks for your help. – alanlittle Mar 13 '17 at 21:02
  • Well let's get it working with the file stream first, _then_ talk about making it work with _stdout_. You need to open the file stream in binary mode. See whether that works, and you'll have narrowed down your problem significantly. – Lightness Races in Orbit Mar 14 '17 at 12:16
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/138025/discussion-between-alanlittle-and-boundaryimposition). – alanlittle Mar 14 '17 at 14:41
  • Nope, I've already provided quite a few hints. Good luck! – Lightness Races in Orbit Mar 14 '17 at 15:06

1 Answer

Windows stdout in Binary Mode

As indicated by my edits and comments above, the problem was not with string.length() at all, but with text-mode output on Windows: any stream that is not opened in binary mode, stdout included, has every \n converted to \r\n as it is written. This happens even when the \n is already part of a \r\n sequence, which then becomes \r\r\n. Thank you, Microsoft, for always knowing so much better than me what I really want to do.
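To make the byte counts from the question concrete, here is a minimal sketch that imitates the translation in plain C++. The helper name simulateTextModeWrite is made up for illustration and is not part of any library; it only models the behaviour described above and does not touch a real stream.

```cpp
#include <string>

// Hypothetical helper: models what a text-mode stream on Windows does on
// output, inserting '\r' before every '\n' no matter what precedes it.
std::string simulateTextModeWrite(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '\n') {
            out += '\r';  // added even if the previous byte was already '\r'
        }
        out += c;
    }
    return out;
}
```

Feeding it the 40-byte test file (one \r\n) gives 41 bytes, and the 42-byte file (two \r\n) gives 44, which matches what ended up in testout.txt.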

My first solution, to convert all \r\n to \n before outputting (so that when Windows converted them back to \r\n the byte count would be correct) really was not an ideal solution, as it only addressed files being read and output, and anything output directly by the program was again causing the byte count to be off. Of course, I could have just appended \r\n to all my output (only to strip it and then have Windows put it back), but that seemed a bit...kludgey. After a good night's sleep and more thought and reading, I decided that forcing Windows to keep its hands off my bytes was the better solution -- to change stdout to binary mode.

However, the question that BoundaryImposition linked to did not have all the information I needed. So, after much googling and reading, here for posterity is the complete solution I settled on:

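#include <cstdio>   // stdout, _fileno

#ifdef _WIN32       // defined for both 32-bit and 64-bit Windows targets
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_BINARY
#endif

int main() {
    #ifdef _WIN32
    // Switch stdout to binary mode before anything is written to it,
    // so that \n is no longer expanded to \r\n.
    _setmode(_fileno(stdout), _O_BINARY);
    #endif
}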
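The same principle applies to file streams such as the testout.txt stream in the question, which was opened without ios::binary. Below is a rough sketch of reading and writing with binary mode on both sides; the helper names readBinary and writeBinary are invented for this example, and the error handling is deliberately minimal.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads a whole file with no newline translation.
// Returns an empty string if the file cannot be opened.
std::string readBinary(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: writes the bytes exactly as given.
// Returns false if the file could not be opened or written.
bool writeBinary(const std::string& path, const std::string& data) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();
    return out.good();
}
```

With both streams in binary mode, the number of bytes on disk should equal data.size(), so a Content-Length computed from the string's length stays accurate.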

Thank you to BoundaryImposition and to everyone else for your help and for continuing to beat me over the head with what I really needed to do, until it finally stuck.
