12

I have seen many posts but didn't find what I am looking for.
I am getting the wrong output:

ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ......  // may be this is EOF character

The program also goes into an infinite loop.

My algorithm:

  1. Go to the end of the file.
  2. Decrease the position of the pointer by 1 and read character by character.
  3. Exit if we have found our 10 lines or we reach the beginning of the file.
  4. Now scan the file from that position till EOF and print the lines // not implemented in the code.

code:

#include<iostream>
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>

using namespace std;
int main()
{
    FILE *f1=fopen("input.txt","r");
    FILE *f2=fopen("output.txt","w");
    int i,j,pos;
        int count=0;
        char ch;
        int begin=ftell(f1);
        // GO TO END OF FILE
        fseek(f1,0,SEEK_END);
        int end = ftell(f1);
        pos=ftell(f1);

        while(count<10)
        {
            pos=ftell(f1);
            // FILE IS LESS THAN 10 LINES
            if(pos<begin)
                break;
            ch=fgetc(f1);
            if(ch=='\n')
                count++;
            fputc(ch,f2);
            fseek(f1,pos-1,end);
        }
    return 0;
}

UPD 1:

Changed code: it has just one error now. If the input has lines like

3enil
2enil
1enil

it prints only 10 lines:

line1
line2
line3ÿine1
line2
line3ÿine1
line2
line3ÿine1
line2
line3ÿine1
line2

PS:

  1. I am working on Windows, in Notepad++.

  2. This is not homework.

  3. I also want to do it without using any more memory or the STL.

  4. I am practicing to improve my basic knowledge, so please don't post about existing tools (like tail -5 etc.).

Please help me improve my code.

Not a real meerkat
Aseem Goyal

8 Answers

9

Comments in the code

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *in, *out;
    int count = 0;
    long int pos;
    char s[100];

    in = fopen("input.txt", "r");
    /* always check return of fopen */
    if (in == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    out = fopen("output.txt", "w");
    if (out == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    fseek(in, 0, SEEK_END);
    pos = ftell(in);
    /* Don't write each char on output.txt, just search for '\n' */
    while (pos) {
        fseek(in, --pos, SEEK_SET); /* seek from begin */
        if (fgetc(in) == '\n') {
            if (count++ == 10) break;
        }
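    }
    /* Fewer than 11 newlines found: the loop stopped after reading the
       first char of the file, so go back to the very beginning */
    if (count <= 10) fseek(in, 0, SEEK_SET);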
    /* Write line by line, is faster than fputc for each char */
    while (fgets(s, sizeof(s), in) != NULL) {
        fprintf(out, "%s", s);
    }
    fclose(in);
    fclose(out);
    return 0;
}
David Ranieri
  • Yours is a good implementation, but I want to know whether it (pos) will work if I have a very large number of lines and characters, say a file of around 20-30 GB. – Aseem Goyal Jul 26 '13 at 10:46
  • Seems that you are on Windows and 20-30 GBs can be a problem, use `_fseeki64` and `_ftelli64` that support longer file offsets even on 32 bit Windows – David Ranieri Jul 26 '13 at 10:53
8

There are a number of problems with your code. The most important one is that you never check that any of the functions succeeded. Saving the result of ftell in an int isn't a very good idea either. Then there's the test pos < begin; this can only be true if there was an error. You also put the result of fgetc in a char, which loses information. The first read you do is at the end of the file, so it fails (and once a stream enters an error state, it stays there). Finally, you can't reliably do arithmetic on the values returned by ftell (except under Unix) if the file was opened in text mode.

Oh, and there is no "EOF character"; 'ÿ' is a perfectly valid character (0xFF in Latin-1). Once you assign the return value of fgetc to a char, you've lost any possibility to test for end of file.
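To make the point about fgetc concrete, here is a minimal sketch in C. The function name count_bytes is only an illustration, not something from the question. It keeps the result of fgetc in an int, so a 0xFF byte in the file is counted like any other byte instead of being mistaken for the end of the file:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in fp.
 * The result of fgetc is kept in an int so that the byte 0xFF (255)
 * stays distinct from EOF, which is a negative value. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                          /* int, not char */

    assert(fp != NULL);
    while ((c = fgetc(fp)) != EOF)  /* stops only at end of file or on error */
        n++;
    return n;
}
```

With char ch = fgetc(fp), the comparison against EOF either fires too early on a 0xFF byte (when char is signed) or never fires at all (when char is unsigned), which matches the run of 'ÿ' characters and the endless loop described in the question.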

I might add that reading backwards one character at a time is extremely inefficient. The usual solution would be to allocate a sufficiently large buffer, then count the '\n' in it.

EDIT:

Just a quick bit of code to give the idea:

#include <algorithm>
#include <fstream>
#include <string>
#include <vector>

std::string
getLastLines( std::string const& filename, int lineCount )
{
    size_t const granularity = 100 * lineCount;
    std::ifstream source( filename.c_str(), std::ios_base::binary );
    source.seekg( 0, std::ios_base::end );
    size_t size = static_cast<size_t>( source.tellg() );
    std::vector<char> buffer;
    int newlineCount = 0;
    while ( source 
            && buffer.size() != size
            && newlineCount < lineCount ) {
        buffer.resize( std::min( buffer.size() + granularity, size ) );
        source.seekg( -static_cast<std::streamoff>( buffer.size() ),
                      std::ios_base::end );
        source.read( buffer.data(), buffer.size() );
        newlineCount = std::count( buffer.begin(), buffer.end(), '\n');
    }
    std::vector<char>::iterator start = buffer.begin();
    while ( newlineCount > lineCount ) {
        start = std::find( start, buffer.end(), '\n' ) + 1;
        -- newlineCount;
    }
    std::vector<char>::iterator end = remove( start, buffer.end(), '\r' );
    return std::string( start, end );
}

This is a bit weak in the error handling; in particular, you probably want to distinguish between the inability to open a file and any other errors. (No other errors should occur, but you never know.)

Also, this is purely Windows, and it supposes that the actual file contains pure text, and doesn't contain any '\r' that aren't part of a CRLF. (For Unix, just drop the next to the last line.)
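For anyone who wants to stay in plain C, the same idea (open in binary mode, look for '\n' from the end, treat '\r' separately) might be sketched as follows. The name tail_offset is an assumption made for this illustration, and the sketch reads one byte at a time for clarity rather than in blocks:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the byte offset at which the last n lines
 * of fp begin, 0 if the file has fewer than n lines, or -1 on error.
 * fp is assumed to be opened in binary mode ("rb"). */
long tail_offset(FILE *fp, int n)
{
    long pos;
    int newlines = 0;
    int c;

    assert(fp != NULL && n > 0);
    if (fseek(fp, 0L, SEEK_END) != 0 || (pos = ftell(fp)) < 0)
        return -1;
    /* A '\n' that terminates the final line does not start a new line. */
    if (pos > 0) {
        if (fseek(fp, pos - 1, SEEK_SET) != 0 || (c = fgetc(fp)) == EOF)
            return -1;
        if (c == '\n')
            pos--;
    }
    /* Walk backwards; pos is always one past the byte being examined. */
    while (pos > 0) {
        if (fseek(fp, pos - 1, SEEK_SET) != 0 || (c = fgetc(fp)) == EOF)
            return -1;
        if (c == '\n' && ++newlines == n)
            return pos;             /* first byte after the n-th newline */
        pos--;
    }
    return 0;
}
```

The caller would then fseek to the returned offset and copy bytes to the output, dropping each '\r' on Windows as described above. Note that long may be too narrow for very large files on some platforms.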

James Kanze
  • Actually I am just coding it for practice, and I want to read a file backwards. It is not for efficiency; it's just to gain confidence in handling files. I have learnt a lot that I was unaware of. Thank you. – Aseem Goyal Jul 26 '13 at 10:42
  • Well, the important thing is to always check for errors before using the results of reading (which I _didn't_ do in my example code), that `fgetc` (and `istream::get()`) return an `int`, not a `char` in order to return an out of band EOF (which is also used for errors), and the fact that any error conditions are sticky: if you see an error, you have to clear it before any further operations on the stream can work. And if you're working in C++, you'd be much better off learning iostream, as it is far more flexible and safer. – James Kanze Jul 26 '13 at 10:54
  • As you pointed out, "ftell in an int isn't a very good idea either". pos=ftell(pos); if(pos<0) break; is not working. So what can be done in C? I will try C++ afterwards. – Aseem Goyal Jul 26 '13 at 11:28
  • `ftell` in an int isn't a good idea, because the function returns a `long`, and putting it in an `int` can overflow. But that's not your problem if you are testing with a small file. The other thing is the value of that `long`: under Windows (at least on a binary file) and Unix, it is the number of bytes from the beginning of the file; unless there is an error, it can never be less than zero. – James Kanze Jul 26 '13 at 12:31
  • Nor less than `begin`, in your code; you cannot position before the beginning of the file. – James Kanze Jul 26 '13 at 12:34
  • And one additional point: you cannot read a file open in text mode backwards under Windows. It just isn't possible. Make sure that you open the file in binary mode, and then handle the `'\r'` in the CRLF sequence manually. (And be aware that most things you will do which involve seeking are implementation dependent, and will behave differently under Windows and under Unix.) – James Kanze Jul 26 '13 at 12:39
4

This can be done very efficiently using a circular array. No additional buffer is required.

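#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void printlast_n_lines(const char* fileName, int n){

    if (n <= 0) return; // nothing to print, and avoids a modulo by zero

    const int k = n;
    ifstream file(fileName);
    vector<string> l(k); // circular array holding the last k lines
    string line;
    int size = 0 ;

    // read into a temporary first: a failed getline at EOF would clear a slot
    while(getline(file, line)){
        l[size%k] = line; //this is just circular array
        size++;
    }

    //start of circular array & size of it
    int start = size > k ? (size%k) : 0 ; //this gets the start of the last k lines
    int count = min(k, size); // no of lines to print

    for(int i = 0; i< count ; i++){
        cout << l[(start+i)%k] << '\n' ; // start from in between and wrap around via the remainder till all counts are covered
    }
}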

Please provide feedback.

ColinDave
igauravsehrawat
1

I believe you are using fseek incorrectly: its third argument must be SEEK_SET, SEEK_CUR or SEEK_END, not a file offset such as end. Look up "man fseek" on Google.

Try this:

fseek(f1, -2, SEEK_CUR);
// 1 to neutralize the change from fgetc
// and 1 to move backward

Also, before the loop you should set the position to the last character:

fseek(f1, -1, SEEK_END);

You don't need the end variable.

You should check the return values of all functions (fgetc, fseek and ftell); it is good practice. I don't know whether this code will work with empty files or similar edge cases.
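As a rough illustration of the two backward steps with every return value checked, here is a small sketch. The helper name read_prev_byte is made up for this example, and it assumes the file was opened in binary mode ("rb"), since relative seeks are only well defined on binary streams:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the byte just before the current position and
 * leaves the stream positioned on that byte, so repeated calls walk toward
 * the start of the file. Returns EOF at the beginning of the file or on
 * any error. */
int read_prev_byte(FILE *fp)
{
    int c;

    assert(fp != NULL);
    if (ftell(fp) <= 0)                 /* error, or already at the start */
        return EOF;
    if (fseek(fp, -1L, SEEK_CUR) != 0)  /* step back onto the previous byte */
        return EOF;
    c = fgetc(fp);                      /* reading moves forward again */
    if (c == EOF)
        return EOF;
    if (fseek(fp, -1L, SEEK_CUR) != 0)  /* undo the move made by fgetc */
        return EOF;
    return c;
}
```

After fseek(f1, 0, SEEK_END), calling this in a loop until it returns EOF, or until enough '\n' bytes have been seen, visits the file from the last byte to the first. An empty file simply yields EOF on the first call.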

Ari
  • $ man fseek 'man' is not recognized as an internal or external command, operable program or batch file. – default Jul 26 '13 at 10:51
  • @Default use linux or Internet – Ari Jul 26 '13 at 11:37
  • @Ari The original poster stated explicitly that he was under Windows. (And even under Unix, I would recommend going to the Posix standard, rather than `man`, if you're interested in portability. Although a lot of man pages _will_ specify what is standard, and what is extension.) – James Kanze Jul 26 '13 at 12:36
  • @JamesKanze My mistake. Updated changing man source from local to Google. – Ari Jul 26 '13 at 15:48
1
int end = ftell(f1);
pos=ftell(f1);

This gives you the last position in the file, i.e. EOF. When you read there, you get the EOF error, and the pointer wants to move one position forward...

So I recommend decreasing the current position by one. Or put fseek(f1, -2, SEEK_CUR) at the beginning of the while loop, to make up for the one position consumed by the read and go one position back...

DaMachk
0

Use fseek(f1, -2, SEEK_CUR); to step back.

I wrote this code; it works, you can try it:

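#include <stdio.h>

int main(void)
{
        int count = 0;
        const char * fileName = "count.c";
        const char * outFileName = "out11.txt";
        FILE * fpIn;
        FILE * fpOut;
        /* binary mode: relative seeks are only well defined on binary streams */
        if((fpIn = fopen(fileName,"rb")) == NULL ) {
                printf(" file %s open error\n",fileName);
                return 1;
        }
        if((fpOut = fopen(outFileName,"w")) == NULL ) {
                printf(" file %s open error\n",outFileName);
                fclose(fpIn);
                return 1;
        }
        /* start on the last byte so that the first fgetc reads it */
        if(fseek(fpIn,-1,SEEK_END) == 0)
        {
                while(count < 10)
                {
                        int now = fgetc(fpIn); /* int, so EOF can be detected */
                        if(now == EOF)
                                break;
                        /* note: the characters are written in reverse order */
                        printf("%c",now);
                        fputc(now,fpOut);
                        if(now == '\n')
                                ++count;
                        /* 1 back over the byte just read, 1 more to move backward */
                        if(ftell(fpIn) < 2L || fseek(fpIn,-2,SEEK_CUR) != 0)
                                break;
                }
        }
        fclose(fpIn);
        fclose(fpOut);
        return 0;
}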
Lidong Guo
0

I would use two streams to print the last n lines of the file. This runs in O(lines) time and needs only O(1) extra space (a single line buffer).

#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(){
  // read last n lines of a file
  ifstream f("file.in");
  ifstream g("file.in");

  // move f stream n lines down.
  int n;
  cin >> n;
  string line;
  for(int i=0; i<n; ++i) getline(f,line);

  // move f and g stream at the same pace.
  for(; getline(f,line); ){
    getline(g, line);
  }

  // g is now n lines before the end; print the rest.
  for(; getline(g,line); )
    cout << line << endl;
}

A solution with O(lines) runtime and O(k) space uses a queue:

ifstream fin("file.in");
int k;
cin >> k;
queue<string> Q;
string line;
for(; getline(fin, line); ){
  if(Q.size() == k){
    Q.pop();
  }
  Q.push(line);
}
while(!Q.empty()){
  cout << Q.front() << endl;
  Q.pop();
}
shisui
0

Here is the solution in C++.

#include <iostream>
#include <string>
#include <exception>
#include <cstdlib>

int main(int argc, char *argv[])
{
    auto& file = std::cin;

    int n = 5;
    if (argc > 1) {
        try {
            n = std::stoi(argv[1]);
        } catch (std::exception& e) {
            std::cout << "Error: argument must be an int" << std::endl;
            std::exit(EXIT_FAILURE);
        }
    }

    file.seekg(0, file.end);

    n = n + 1; // Add one so the loop stops at the newline above
    while (file.tellg() != 0 && n) {
        file.seekg(-1, file.cur);
        if (file.peek() == '\n')
            n--;
    }

    if (file.peek() == '\n') // If we stop in the middle we will be at a newline
        file.seekg(1, file.cur);

    std::string line;
    while (std::getline(file, line))
        std::cout << line << std::endl;

    std::exit(EXIT_SUCCESS);
}

Build:

$ g++ <SOURCE_NAME> -o last_n_lines

Run:

$ ./last_n_lines 10 < <SOME_FILE>
Steven Eckhoff