-2

I am trying to read 7M data from a file but it is failing. When I googled,I found that there is no limit for reading data.

My code given below is failing with segmentation fault.

char *buf = malloc(7008991);
FILE *fp = fopen("35mb.txt", "rb");
long long i = 0;
long long j = 0;
while(fgets(buf+i, 1024, fp)) {
    i+=strlen(buf);
    if(i==7008991)break;
}
printf("read done");
printf("ch=%s\n", buf);

Need some help

Spikatrix
  • 20,225
  • 7
  • 37
  • 83
anand
  • 267
  • 1
  • 3
  • 16
  • Are you sure, that your `malloc` is not failing. – Dayal rai Jul 16 '14 at 11:43
  • upto how much memory malloc can allocate ? – anand Jul 16 '14 at 11:45
  • 2
    Are you sure to use fgets with binary mode? Also i (or buf+i) is wrongly incremented, and the exit condition is too slim to cover many actual cases. – Non-maskable Interrupt Jul 16 '14 at 11:45
  • can you please correct where i am doing wrong in code, please. – anand Jul 16 '14 at 11:46
  • can i create a buffer of size 35MB in c program ? – anand Jul 16 '14 at 11:49
  • 7
    Your title says 700k, the question says 70k, you malloc ~7MB, and the input file is named "35mb.txt". Are you sure about any of these numbers? – Blastfurnace Jul 16 '14 at 11:50
  • does `fopen` return `NULL`? You need to check for that! You use `malloc` but do not `free`. Plus you are allocating way too much memory. – H_squared Jul 16 '14 at 11:55
  • `fgets` with a `FILE *` that was opened in binary mode (_if_ the `fopen` call was successful)? I'd use `fread`... makes more sense to me to read N bytes, not N bytes or less (in case of new line). If that doesn't bother you, do it if only because `fgets` is short for File-GET-String as opposed to `fread` => File-READ. – Elias Van Ootegem Jul 16 '14 at 11:56
  • Sorry about the number... actually i am doing this since long so, please pardon for numbers :( – anand Jul 16 '14 at 11:59
  • @anand , Please edit the question and correct all the numbers.This would save a lot of confusion and might also save some down-votes! – Spikatrix Jul 16 '14 at 12:01
  • @blastfurnace that is related to just reading , but there is not mention of size, ie how much can be read at a time, so i dont think this is any way related to that question. please vote carefully – anand Jul 16 '14 at 12:07
  • @anand: Read the accepted answer. It shows one way to correctly read a file into a block of memory. What you have now is not good. Note that on any reasonably modern computer and OS 35MB is not a large file. – Blastfurnace Jul 16 '14 at 12:08
  • you try that with a file of 35mb text file, if that will work i will accept what ever you say. – anand Jul 16 '14 at 12:10
  • @anand: I don't know what hardware or OS you are using but on my modest Windows 8 64-bit desktop I can slurp several hundred megabytes into a single memory allocation with a single read operation. I do it with video files. – Blastfurnace Jul 16 '14 at 12:20
  • @anand: Just _try_ to use the code Blastfurnace suggested. I tested it with a 35Mb file, and it worked just fine. I've also tested a buffered read (1024 bytes at a time, reallocating memory as you go along and such) and that worked just as well. I've timed both approaches: read in one block ~0.015 sec for 35Mb. Buffered read: ~0.025 seconds. Read the whole file in one go, I'd say – Elias Van Ootegem Jul 16 '14 at 13:25
  • Where does it fail exactly ? Youd debugger will tell you. – Jabberwocky Jul 16 '14 at 14:46

3 Answers3

0

If you want to read the content of a large file into memory, you may: 1. actually read it 2. mmap it.

I'll cover how to actually read it, and assume using binary mode and no text-mode mess.

FILE* fp;
// Open the file
fp = fopen ("35mb.txt", "rb");
if ( fp == NULL ) return -1; // Fail

// Get file length, there are many use to do this like fstat
// TODO: check failure
fseek ( fp, 0, SEEK_END );
flen = ftell ( fp );
fseek ( fp, 0, SEEK_SET );

if ( fread ( buffer, flen, 1, fp ) != 1 ) {
  // Fail
}

fclose ( fp );
Non-maskable Interrupt
  • 3,841
  • 1
  • 19
  • 26
0

There are a few things that could go wrong here.

Firstly, no this line, memory allocation can fail. (Malloc can return a NULL pointer, you should check this. (You should also check that the file opened without error.)

char *buf = malloc(7008991);

Next, in the loop. Remember that fgets reads one line, regardless of how long that is, up to a maximum of 1024-1 bytes (and appends a null-character). Please not that for binary input, using fread is probably more appropriate.

while(fgets(buf+i, 1024, fp)) {

After that, this is a good line, as you really do not know how long a line is.

i+=strlen(buf); 

This line however is probably why you are failing.

if(i==7008991)break;

You are requireing the size to be exactly 77008991 bytes long to break. That is rather unlikely unless you are very very sure about the formatting of your file. This line should probably read if ( i >= 7008991 ) break;

You should probably replace your explicit size with a named constant as well.

Stian Svedenborg
  • 1,797
  • 11
  • 27
0

Most probably the size of your file is exactly 7008991 bytes. But when you read the file with fgets you ask to write at most 1024 bytes. This is not true when you reach the end of the file. Suppose you already read 7008990 bytes, then you should call fgets with: fgets(buf+i, 1, fp) because your buffer has got no more than one byte left.

Another issue is that you want to print the buffer at the end of your program. For this to work your buffer must be NUL terminated. So you need to allocate one more byte than the file size. fgets will automatically append the NUL byte.

Yet another issue is the way you increment your counter: i += strlen(buf) this is wrong, the correct code is: i = strlen(buf)

All of this assume there is no NUL bytes in your code. As already explained in comments, it is wiser to use fgets only when dealing with text files. When reading binary files you'd better use fread.

The corrected code would be:

unsigned long FILE_SIZE = 7008991+1;
char *buf = malloc(FILE_SIZE);
FILE *fp = fopen("35mb.txt", "rb");
long long i = 0;
long long j = 0;
while(fgets(buf+i, FILE_SIZE-i, fp)) {
    i = strlen(buf);
    if(i==7008991)break;
}
printf("read done");
printf("ch=%s\n", buf);
fjardon
  • 7,921
  • 22
  • 31