I'm trying to get the last line of a file, using the logic described in "Fastest way to read only last line of text file?", but I'm seeing a strange anomaly:

score.seekg(-2, ios::cur);

resets my stream to the same character, so I get an infinite loop. However, setting the offset to -3 works perfectly:

fstream score("high_scores.txt"); //open file

if(score.is_open()) //file exist
{   
    score.seekg(0, ios::end);

    char tmp = '~';
    while(tmp != '\n')
    {
        score.seekg(-3, ios::cur);

        if((int)score.tellg() <= 0) //start of file is start of line
        {
            score.seekg(0);
            break;
        }
        tmp = score.get();
        cout << tmp << "-";
    }
}

Again, the problem is that this code works only with a seekg() offset of -3 when, in theory, it should work with -2. Can this be explained somehow? The file contents look like this (with a newline at the end of the file):

28 Mon Jul 10 16:11:24 2017
69 Mon Jul 10 16:11:47 2017
145 Mon Jul 10 16:53:09 2017

I'm using Windows, so now I understand why I need a -3 offset from the end of the file (to get past the CR and LF bytes). But let's consider the first character from the end:

28 Mon Jul 10 16:11:24 2017

So the stream gets to 7. It extracts it and moves to the CR byte. If we then offset by -3 in the next loop iteration, we should get 0, not 1! But in reality I'm getting 1, and everything works fine with the -3 offset. That is a mystery to me, and I can't get it out of my head.

ScienceDiscoverer
  • Present a [MCVE] please. It's very hard to see where in your question the problem is demonstrated. – Lightness Races in Orbit Jul 10 '17 at 14:46
  • It's most likely a problem caused by line ending style - DOS vs Unix. – R Sahu Jul 10 '17 at 14:47
  • View your file in a hex editor or an editor that can show you the decimal or hexadecimal value of each character in the file. The alternative is to open the file in binary and display each character in hexadecimal or decimal. This will show you if the line endings are CR, LF (0x0d, 0x0a) or only LF (0x0a). – Thomas Matthews Jul 10 '17 at 14:57
  • @ThomasMatthews Indeed, it has 0d 0a line endings. A Windows-specific thing? OK, but how does this affect non-endline chars? Say the stream got past 0a and 0d, read the last char, and moved +1 position. If we now offset -3 from the current position, in theory we get not the 2nd but the 3rd char (from the end)! But we get the 2nd... – ScienceDiscoverer Jul 10 '17 at 16:20
  • Each line ending character counts towards positioning. The -2 should position to the last letter, '7'. A -3 should position to the '1'. At least, that is my understanding. – Thomas Matthews Jul 10 '17 at 16:48

1 Answer

I hope this illustrates what is happening:

28 Mon Jul 10 16:11:24 2017CL  <- C = CR, L = LF
                       6543210 <- position relative to ios::end
                        | || |
                        | || * Start after seekg(0, ios::end)
                        | *|   After first seekg(-3, ios::cur)
                        |  *   After first get()
                        *      After second seekg(-3, ios::cur)

When you seek to ios::end, you move the stream position to the byte right past the end of the file. If you then seek -3, you skip over the CR and LF and end up on the '7'. Reading this byte moves the position one byte ahead, onto the CR. Then you go three back again and end up at the '0'.
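
The position arithmetic can be checked on an in-memory stream, which does no newline translation, so every offset is a plain byte offset. This is only a sketch: `trace_two_steps` is a name invented here, and a `std::istringstream` stands in for the file.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: repeats the "seek back 3, read 1" step twice,
// starting from the end of `bytes`, and returns the two characters read.
std::string trace_two_steps(const std::string& bytes)
{
    std::istringstream in(bytes);
    std::string seen;

    in.seekg(0, std::ios::end);              // one past the last byte
    for (int i = 0; i < 2; ++i)
    {
        in.seekg(-3, std::ios::cur);         // three bytes back
        seen += static_cast<char>(in.get()); // get() moves one byte forward
    }
    return seen;
}
```

Calling `trace_two_steps("28 Mon Jul 10 16:11:24 2017\r\n")` returns `"70"`: the first read yields the '7' and the second the '0', with the '1' skipped.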

Note that line endings in the file really are two bytes (CR and LF). It's just that when you read them in text mode, they are converted to a single '\n'. Seeking, however, uses byte offsets into the actual file. This is why people recommend that you either read the file from start to finish, or open it in binary mode to remove this dichotomy.

G. Sliepen
  • Wait, you made a little typo: it's not seekg(-3, ios::end), it's seekg(-3, ios::cur)! In theory, I should get 0 after the second seekg, but the anomaly happens. I actually get 1 after 7, then 0, 2, space, 4, 2, etc., as if everything is working just fine. Isn't this strange? I can't explain this... It's like get() advances the pointer by 2 after extracting a char, not by 1. But that isn't possible, right? – ScienceDiscoverer Jul 11 '17 at 11:08
  • Thanks, I fixed the typo! But if you get the `1` instead of the `0`, that is indeed very strange. `get()` shouldn't advance the pointer by two bytes for a normal `fstream`. – G. Sliepen Jul 12 '17 at 17:34