I am trying to read from a file using fstream. The file has this content:

1200
1000
980
890
760

My code:

#include <fstream>
#include <iostream>

using namespace std;

int main ()
{
    fstream file("highscores.txt", ios::in | ios::out);

    if (!file.is_open())
    {
        cout << "Could not open file!" << endl;
        return 0;
    }

    int cur_score; 

    while (!file.eof())
    {
        file >> cur_score;
        cout << file.tellg() << endl;
    }
}

The output is:

9
14
18
22
26

Why does tellg() return 9 after the first read? The first read consumes the number 1200, which is 4 characters, and even counting the \r and \n that follow it, that makes only 6. Also, if I add more numbers to the file, tellg() returns a larger value after the first read.

Christophe
ehab
  • Don't use [`while(!file.eof())`](http://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong). – πάντα ῥεῖ May 30 '15 at 12:53
  • @πάνταῥεῖ: Exactly. Note that the OP has *five* input data, but *six* lines of output. Tsk Tsk. – Kerrek SB May 30 '15 at 12:57
  • There is no other whitespace; I made sure that there are no spaces after each number...I don't know if this info helps, but if you make it a csv it works fine...the result of tellg() makes more sense – ehab May 30 '15 at 12:59
  • @πάνταῥεῖ yes, thank you..I modified it, but I still get the same behavior: the result of tellg() depends on the number of lines in the file being read – ehab May 30 '15 at 13:02
  • [Cannot reproduce](https://ideone.com/a5dkA2), voting to close. – Kerrek SB May 30 '15 at 13:02
  • @πάνταῥεῖ I meant that after last number (760) there was a new line..so I edited the txt file and removed this new line – ehab May 30 '15 at 13:07
  • @KerrekSB the problem can be reproduced by compiling with MinGW under Windows and reading a CRLF file opened in text mode. – Christophe May 30 '15 at 15:27
  • @ehab: Please attach a hexdump of your input file. – Kerrek SB May 30 '15 at 16:41
  • @KerrekSB I hope this is what you asked for...I used a web site to convert my file to a hexdump. 0000-0010: 31 32 30 30 0d 0a 31 30 30 30 0d 0a 39 38 30 0d (1200..1000..980.) 0000-0019: 0a 38 39 30 0d 0a 37 34 30 (.890..740) – ehab Jun 02 '15 at 21:40

1 Answer

If you saved the file as UTF-8 with a text editor, there might be a UTF-8 BOM at the beginning of the file. The BOM is 3 bytes long, so added to the 6, it would make 9.

To be sure, inspect the first bytes of the file with:

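// Needs <fstream>, <iostream> and <iomanip>, plus "using namespace std;".
fstream file("highscores.txt", ios::in | ios::out | ios::binary);
if (file) {
    char verify[16];
    file.read(verify, sizeof(verify));
    int rd = file.gcount();   // number of bytes actually read
    for (int i = 0; i < rd; i++) {
        // Go through unsigned char so that bytes >= 0x80 (such as a BOM) are not sign-extended.
        cout << hex << setw(2) << (int)(unsigned char)verify[i] << " ";
    }
    cout << dec << endl;
}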

Edit:

Running your code on Windows with MSVC2013 on this file, I get 4, 10, 15, 20, 25 as expected; I couldn't reproduce your figures.

I've now run a test with MinGW, and there I get exactly your numbers, including the strange effect that adding lines to the file increases the output.

THIS IS A BUG in MinGW that occurs when you read a Windows (CRLF line separator) file in text mode:

  • If I save the file in UNIX style (i.e. LF line separator), the same program gives 4, 9, 13, 17, which is again the expected result on a Linux system.

  • If I save the file in WINDOWS style (i.e. CRLF line separator) and change the code to open the file with ios::binary, I get the expected 4, 10, 15, 20, 25.

Apparently it's an old problem.

Christophe
  • but what about the dependency of the tellg() result on the number of lines in the txt file...I mean, if I add one more number to the txt file, I get 10 instead of 9? – ehab May 30 '15 at 13:48
  • Do you run exactly the code that you posted in the question? On which OS? And what are the results if you use my verification code above? – Christophe May 30 '15 at 14:19
  • @ehabibrahim do you add the line to the file with a text editor? Which one? – Christophe May 30 '15 at 14:23
  • I use win7 and run the exact code...the result of your code is this: 31 32 30 30 d a 31 30 30 30 d a 39 38 30 d – ehab May 30 '15 at 14:27
  • Ok! I've reproduced it now, using MinGW under Windows. – Christophe May 30 '15 at 15:10
  • I agree with you...I will use MSVC instead of codeblocks...thank you very much – ehab May 30 '15 at 15:32
  • @Christophe: Is the BOM treated like whitespace by the extraction operation? Or does the locale decode it as "no characters"? – Kerrek SB May 30 '15 at 15:35
  • @KerrekSB it should be interpreted as a [whitespace](https://en.wikipedia.org/wiki/Word_joiner) if used with the appropriate locale – Christophe May 30 '15 at 15:42
  • @Christophe I am sorry I am new here...how to close this question and consider it answered? or just leave it like this? – ehab May 30 '15 at 16:00
  • @ehabibrahim It's ok like this: it's considered as answered and it will help other people having similar issues. – Christophe May 30 '15 at 16:05
  • @KerrekSB the BOM is the utf8 encoding of U+FEFF and `std::isspace()` returns true for it with an UTF8 locale. Addendum for windows: reading an fstream, requires to specify that header shall be consumed: `locale loc(std::locale(), new std::codecvt_utf8 )`. With wfstreams you just need and codecvt_utf8 to have the BOM interpreted as space. – Christophe May 30 '15 at 16:34