
I'm writing a C++ program that parses XML into JSON for a class. It works great when I compile it in Visual Studio, but it behaves strangely when compiled with g++ on Linux.

With a bit of testing, I believe I have tracked the issue down to a difference in the way new lines are handled between the two compilers. Here's some of the code I'm using to debug:

while (!fileToRead.eof()) { //Until we have reached the end of the file: ...
        cout << endl << "newloop: ";
        char c;
        fileToRead.get(c);
        cout << "read " << c << " ";

        if (c != '\n' && c != '\t') 
                cout << "is a text character.";
}

When I run an executable created in Visual Studio, it outputs the following for new line characters it reads:

newloop: read 

newloop: read 

newloop: read 

newloop: read 

When I run it on Linux when compiled with g++, it outputs the following for new line characters it reads:

 is a text character.
newloop: read

 is a text character.
newloop: read

 is a text character.
newloop: read

newloop: read

As you can see, when compiled with g++ there are 2 problems:

  • The third cout ("is a text character.") appears to run before the first and second couts ('cout << endl << "newloop: "' and 'cout << "read " << c << " "').
  • The body of the if statement ("if (c != '\n' && c != '\t')") runs even when c is a new line character.

Can anyone explain what's going on here?

Mat
Aaron T
  • Windows also puts a carriage-return character in there, right before the linefeed. Its escape is `\r`. – Steve Feb 29 '16 at 19:46
  • Not the problem, but also see [Why is iostream::eof inside a loop condition considered wrong?](http://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong) – NathanOliver Feb 29 '16 at 19:47
  • @Steve How would *Windows* put that there? I'd say whatever piece of software (probably some text editor) he used to create that .xml file is responsible for the crappy `\r\n` newlines. – RocketNuts Feb 29 '16 at 19:54
  • @RocketNuts Okay, if you'd like to be pedantic, *Windows* didn't put them there, but on Windows, text files have those two characters as new-line (in general). – Steve Feb 29 '16 at 19:58

1 Answer


You are not parsing the same XML file, even if it looks the same in a text editor.

My guess is that the one on Windows contains CRLF newlines (\r\n, or 0x0D 0x0A), while the other contains only LF newlines (\n, or 0x0A).

Make sure you have the exact same file on either system, and you'll get the same results.
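One way to check this guess is to count the line-ending bytes in each copy of the file. The sketch below is only illustrative: `LineEndingCounts`, `countLineEndings`, and `countLineEndingsInText` are hypothetical names, not part of the question's code. It assumes the stream was opened with `std::ios::binary`, because a text-mode stream on Windows converts `\r\n` to `\n` before the program sees it.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type (not from the original post): tallies of line endings.
struct LineEndingCounts {
    std::size_t crlf = 0;   // "\r\n" pairs (typical for Windows text files)
    std::size_t loneLf = 0; // "\n" not preceded by "\r" (typical for Unix text files)
    std::size_t loneCr = 0; // "\r" not followed by "\n"
};

// Reads the stream byte by byte and classifies each line ending.
// Assumes the stream delivers raw bytes (e.g. opened with std::ios::binary).
LineEndingCounts countLineEndings(std::istream& in) {
    LineEndingCounts counts;
    bool previousWasCr = false;
    char c;
    while (in.get(c)) { // the loop ends as soon as a read fails
        if (c == '\n') {
            if (previousWasCr) ++counts.crlf; else ++counts.loneLf;
            previousWasCr = false;
        } else {
            if (previousWasCr) ++counts.loneCr; // the earlier '\r' stood alone
            previousWasCr = (c == '\r');
        }
    }
    if (previousWasCr) ++counts.loneCr; // file ended on a bare '\r'
    return counts;
}

// Convenience overload for in-memory text, handy for quick experiments.
LineEndingCounts countLineEndingsInText(const std::string& text) {
    std::istringstream in(text);
    return countLineEndings(in);
}
```

For a real file, something like `std::ifstream in("input.xml", std::ios::binary);` followed by `countLineEndings(in)` should work, where `input.xml` is a placeholder name. A nonzero `crlf` count on the Linux copy would suggest the file still carries Windows-style line endings.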

RocketNuts
  • I am using FileZilla to move the xml files from my computer to the Linux server. I am not editing them after moving them, so they should be the exact same files unless they were altered by FileZilla for some reason. – Aaron T Feb 29 '16 at 19:56
  • @AaronT I would write your software to be tolerant of either kind of new-lines, regardless. – Steve Feb 29 '16 at 19:59
  • My guess is the opposite: he probably *did* use the exact same file, which contains `\r\n`, leading to the problem. On Windows, the input stream translates `\r\n` to `\n`, but on Linux it doesn't. A simple solution is to modify `c != '\n' && c != '\t'` to include checking `'\r'`. (Perhaps using `std::isspace`). – M.M Feb 29 '16 at 20:06
  • Steve and M.M were right. I added "&& c != '\r'" to the if statement and now everything seems to be working well. Thanks! – Aaron T Feb 29 '16 at 20:28
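The fix described in these comments can be sketched roughly as follows. The function names `isTextCharacter` and `keepTextCharacters` are made up for illustration and are not from the asker's program. The loop is driven by the result of `get` instead of `eof()`, in line with the question linked in the comments above. Swapping the three comparisons for `std::isspace` would also work, with the caveat that it rejects ordinary spaces as well.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: mirrors the asker's condition with the '\r' check added,
// so CRLF and LF files are classified the same way.
bool isTextCharacter(char c) {
    return c != '\n' && c != '\t' && c != '\r';
}

// Hypothetical helper: returns the characters of the stream that count as text.
std::string keepTextCharacters(std::istream& in) {
    std::string text;
    char c;
    // get() returns the stream, which converts to false once a read fails,
    // so no stale character is processed after the end of the file.
    while (in.get(c)) {
        if (isTextCharacter(c)) {
            text += c;
        }
    }
    return text;
}
```

As a quick check, feeding it a `std::istringstream` built from `"<a>\r\n\thi\r\n</a>\n"` should yield `<a>hi</a>` whether the newlines are CRLF or LF.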