For clarity:
This is NOT a duplicate of "Getting std::ifstream to handle LF, CR, and CRLF?"
This IS an extension of "C++ cutting off character(s) when read lines from file"
I state this up front because when I posted the question "C++ cutting off character(s) when read lines from file", it was tagged as a potential duplicate of "Getting std::ifstream to handle LF, CR, and CRLF?". I tried a simplified version of the solution proposed in that post (reading directly instead of through buffers, to keep it simple), but it did not work for me. Even though I edited my question and the code to demonstrate that, there have been no responses. Jonathon suggested I re-post as a separate question, so here I am.
I have also tried numerous other solutions, ending up with the code below. It handles tabs and normal text as expected, but it still does not handle the newline differences the way I expected, so I need help.
I want to:
- read in the contents of a txt file
- run some validation checks on the content
- output a report to another txt file
In this prototype code I am just reading in text from one file and outputting edited text to another file. After I get this working, I'll then worry about running validation tests, ...
I am compiling and testing on a Linux Mint Maya (Ubuntu 12.04 based) box and then cross-compiling with mingw32 to run on a Windows PC.
Everything works fine when I:
- Compile and run on a Linux box with a Linux-created text file
- Cross-compile on Linux and run on Windows with a Linux-created text file
However, when I:
- Cross-compile on Linux and run on Windows with a Windows-created text file
the result is not as expected: the first few characters of some lines are missing from the output.
I need the program to handle both Windows-created and Linux-created text files.
The input file I am using in all cases (silly content for now, just as a test; one copy created on the Linux box, one created on Windows using Notepad) is:
A new beginning
just in case
the file was corrupted
and the darn program was working fine ...
at least it was on linux
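The two copies of this file should differ only in their line endings, and one way to confirm that is to count the endings in the raw bytes of each copy. The following is a minimal sketch, not part of the program in question; the names `LineEndingCounts`, `count_line_endings` and `count_line_endings_in_file` are hypothetical. It assumes the file is small enough to read into memory and opens it with `std::ios::binary` so that the runtime performs no newline translation.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper type: tallies of each line-ending style found in a buffer.
struct LineEndingCounts {
    std::size_t lf;    // "\n" not preceded by "\r" (Unix/Linux style)
    std::size_t crlf;  // "\r\n" pair (Windows style)
    std::size_t cr;    // "\r" not followed by "\n" (old Mac style)
    LineEndingCounts() : lf(0), crlf(0), cr(0) {}
};

// Counts the line endings in a block of raw bytes.
LineEndingCounts count_line_endings(const std::string& data)
{
    LineEndingCounts counts;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\r') {
            if (i + 1 < data.size() && data[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the '\n' half of the pair
            } else {
                ++counts.cr;
            }
        } else if (data[i] == '\n') {
            ++counts.lf;
        }
    }
    return counts;
}

// Reads the whole file in binary mode (no newline translation) and counts
// its line endings. A file that cannot be opened yields all-zero counts.
LineEndingCounts count_line_endings_in_file(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    return count_line_endings(data);
}
```

Calling `count_line_endings_in_file("rc_input_file.txt")` on each copy would be expected to report only `lf` endings for the Linux-created file and only `crlf` endings for the Notepad-created one, if the files really differ in the way assumed here.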
When I read the file in and use the program (code shown below) the linux-created text file produces the proper output:
Line 1: A new beginning
Line 2: just in case
Line 3: the file was corrupted
Line 4: and the darn program was working fine ...
Line 5: at least it was on linux
When I use the Windows-created text file and run the program on a Windows PC, the output is:
Line 1: A new beginning
Line 2: t in case
Line 3: e file was corrupted
Line 4: nd the darn program was working fine ...
Line 5: at least it was on linux
As you can see, there are characters missing from lines 2, 3 and 4, but not from lines 1 and 5:
- 0 characters missing from the start of line 1
- 3 characters missing from the start of line 2
- 2 characters missing from the start of line 3
- 1 character missing from the start of line 4
- 0 characters missing from the start of line 5
I expect this has something to do with the different handling of newlines in Linux and Windows text files. I have read the other postings on this and tried their solutions, but they do not seem to solve the issue. I am sure I am missing something very basic, and I apologize in advance if so, but I've been banging away at this for over a week and need help.
The code I am using is:
#include <cstdlib>   // EXIT_FAILURE, EXIT_SUCCESS
#include <cstring>   // strcpy
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv)
{
    /*
     * Program to:
     *   1) read from a text file
     *   2) do some validation checks on the content of that text file
     *   3) output a report to another text file
     */
    std::string rc_input_file_name = "rc_input_file.txt";
    std::string rc_output_file_name = "rc_output_file.txt";
    char *RC_INPUT_FILE_NAME = new char[ rc_input_file_name.length() + 1 ];
    strcpy( RC_INPUT_FILE_NAME, rc_input_file_name.c_str() );
    char *RC_OUTPUT_FILE_NAME = new char[ rc_output_file_name.length() + 1 ];
    strcpy( RC_OUTPUT_FILE_NAME, rc_output_file_name.c_str() );
    std::ifstream rc_input_file_holder;
    rc_input_file_holder.open( RC_INPUT_FILE_NAME , std::ios::in );
    if ( ! rc_input_file_holder.is_open() )
    {
        std::cout << "Error - Could not open the input file" << std::endl;
        return EXIT_FAILURE;
    }
    else
    {
        std::ofstream rc_output_file_holder;
        rc_output_file_holder.open( RC_OUTPUT_FILE_NAME , std::ios::out | std::ios::trunc );
        if ( ! rc_output_file_holder.is_open() )
        {
            std::cout << "Error - Could not open or create the output file" << std::endl;
            return EXIT_FAILURE;
        }
        else
        {
            std::streampos char_num = 0;
            long int line_num = 0;
            long int starting_char_pos = 0;
            std::string file_line = "";
            while ( getline( rc_input_file_holder , file_line ) )
            {
                line_num = line_num + 1;
                // size_type matches the type of find() and npos on every platform
                std::string::size_type file_line_length = file_line.length();
                std::string string_to_find = "\r";
                std::string string_to_insert = "\n";
                std::string::size_type num_char_in_string_to_find = string_to_find.length();
                std::string::size_type character_position;
                while ( ( character_position = file_line.find( string_to_find ) ) != std::string::npos )
                {
                    if ( character_position == file_line_length - num_char_in_string_to_find )
                    {
                        // If the \r character is found at the end of the line,
                        // it is the old Mac style newline,
                        // so replace it with \n
                        file_line.replace( character_position , num_char_in_string_to_find , string_to_insert );
                        file_line_length = file_line.length();
                    }
                    else
                    {
                        // If the \r character is found but is not the last character in the line
                        // it could be the second-last character meaning it is a Windows newline pair \r\n
                        // or it could be somewhere in the middle of the line
                        // so delete it
                        file_line.erase( character_position , num_char_in_string_to_find );
                        file_line_length = file_line.length();
                    }
                }
                int field_display_width = 4;
                rc_output_file_holder << "Line " << line_num << ": " << file_line << std::endl;
                starting_char_pos = rc_input_file_holder.tellg();
            }
            rc_input_file_holder.close();
            rc_output_file_holder.close();
            delete [] RC_INPUT_FILE_NAME;
            RC_INPUT_FILE_NAME = 0;
            delete [] RC_OUTPUT_FILE_NAME;
            RC_OUTPUT_FILE_NAME = 0;
        }
    }
    return EXIT_SUCCESS;
}
Any and all suggestions appreciated ...