
I run the same code on both Windows and Linux. On Windows I get the result I intended, but on Linux I get a different result.

What causes this difference, and how can I fix the code on Linux?

Thanks a lot! :) I have attached the code, the input, and the results from both operating systems.

Below is my code. (It reverses the order of the dot-separated components and separates the components with slashes.)

#include <iostream>
#include <fstream>
#include <string>
#include <vector>


using namespace std;

string set_name = "6000k";

// in
string raw_file = set_name + "_filtered.txt";
// out
string set_file = set_name + "_filtered_dot.txt";

// main
int main()
{
    int i = 0;
    string comp = ""; 
    string str; 

    vector<string> input_comp;
    vector<string> tmp_comp; 
    int input_order = 0;

    ifstream infile;
    infile.open(raw_file);

    ofstream outfile;
    outfile.open(set_file);

    if (infile.fail()) // error handling
    {
        cout << "error; raw_file cannot be open..\n";
    }

    while (!infile.fail())
    {
        char c = infile.get();

        if (c == '\n' || c == '/')
        {
            if (comp != "") 
            {
                input_comp.push_back(comp);
            }

            int num = input_comp.size();
            for (int j = 0; j < num; j++)
            {
                int idx = (num - 1) - j;
                outfile << "/" << input_comp[idx];
            }

            if (c == '\n')
            {
                outfile << "/" << endl;
            }

            input_comp.clear();
            str = "";
            comp = "";
        }
        else if (c == '.')
        {
            if (comp != "") 
            {
                input_comp.push_back(comp);
            }

            comp = "";
        }
        else 
        {
            str = c;
            comp = comp + str;
        }

    }

    infile.close();
    outfile.close();

    return 0;
}

This is the input in 'raw_file' as declared in the code:

/blog.sina.com.cn/mouzhongshao
/blogs.yahoo.co.jp/junkii3/11821140.html
/allplayboys.imgur.com

This is the result on Windows (this is what I want to get from the code above):

/cn/com/sina/blog/mouzhongshao/
/jp/co/yahoo/blogs/junkii3/html/11821140/
/com/imgur/allplayboys/

This is the result on Linux (unexpected):

/cn/com/sina/blog/mouzhongshao
/
/jp/co/yahoo/blogs/junkii3/html
/11821140/
/com
/imgur/allplayboys/
user4581301
jjjhseo
  • `while (!infile.fail())` checks for failure before reading. Do not expect this to work. – user4581301 Aug 14 '17 at 22:04
  • The final value seems to include the newline on linux (e.g. `"html\n"` instead of just `"html"`) – Justin Aug 14 '17 at 22:04
  • If the input file was created on windows it will contain the windows end of line: \r\n. This will mess up your output under Linux because it will print the \r. – user4581301 Aug 14 '17 at 22:16
  • `\r` vs `\n` vs `\r\n` vs `\n\r` vs other combinations. Beware your assumptions about file formats. – Jesper Juhl Aug 14 '17 at 22:25

1 Answer

Windows uses a compound end of line: carriage return and line feed (\r\n). On Windows, when a C++ file stream that opened a file in text mode (the default) finds \r\n, it silently converts it to \n.

Linux uses only a line feed (\n). When a file stream on Linux finds \r\n, no conversion takes place: the \r is treated like a regular character and passed to the parser.

So on Linux /blog.sina.com.cn/mouzhongshao\r\n is broken up into

<empty>
blog
sina
com
cn
mouzhongshao\r

Depending on how the console handles the \r, this may print

/cn/com/sina/blog/mouzhongshao
/

or

/cn/com/sina/blog/mouzhongshao

with the carriage return moving the cursor back to the beginning of the line and the last / overwriting the first.
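The retained \r can be seen without touching the file system. The following is a minimal sketch, not part of the asker's program: `read_first_line` and `ends_with_cr` are hypothetical helper names, and a `std::istringstream` stands in for the file because a string stream performs no line-ending translation on any platform, which mirrors what a file stream does on Linux.

```cpp
#include <sstream>
#include <string>

// Reads the first line of `text` the way a file stream on Linux would:
// only '\n' ends the line, so a preceding '\r' stays in the data.
std::string read_first_line(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return line;
}

// Hypothetical helper: true when the line still carries a carriage return.
bool ends_with_cr(const std::string& line)
{
    return !line.empty() && line.back() == '\r';
}
```

Calling `read_first_line` on a string with Windows line endings returns the text with the \r still attached, so the last component of each line is one character longer than it appears.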

The easy solution is to convert the input file to Linux-style line endings. Many Linux text editors have a DOS to Unix format conversion utility built in. A dos2unix application is also widely available. If all else fails, rewrite the file under Linux.

A longer solution is to make both Windows and Linux behave the same. Many examples of this already exist. Here is one: Getting std::ifstream to handle LF, CR, and CRLF?

Also watch out for while (!infile.fail()), as it tests the stream state before reading. The read inside the loop can then fail, and the loop body will still use the invalid result before the failure is noticed. More on that here: Why is iostream::eof inside a loop condition considered wrong?

To resolve this, do not immediately cast the result of infile.get() to a char. Keep it as an int long enough to see whether the result is Traits::eof() before using the value as a char.

user4581301
  • I wrote the code on Windows, which is why this difference occurs. I did not think of the different line endings (\r\n and \n) on Windows and Linux. Thanks for the clear explanation! :) – jjjhseo Aug 14 '17 at 23:41