0

I am trying to read() a file whose exact size I don't know into a variable so that I can process it later, so I am looping like this:

char buf[BUFSIZE];
char* contentsOfFile;

fd = open(file, O_RDONLY);

while ( (nbytes = read(fd, buf, sizeof(buf)) ) > 0) { // keep reading until the end of file or error 
    strcat(contentsOfFile, buf);
}

Of course, this crashes unless contentsOfFile is another char array, but I cannot do that because the file could be larger than any fixed-size array can hold.

Is there any other library solution, or should I resort to malloc?

Lightsong
  • 1
    `strcat()` stops when (if) it finds a `0` and you then overwrite the rest of the buffer in the next loop. Or you overflow the destination when `strcat()` does not find a `0`. – Weather Vane Dec 05 '20 at 19:34
  • 1
    And `char* contentsOfFile;` has no memory allocated. – Weather Vane Dec 05 '20 at 19:36
  • 1
    Why not find the size of the file and read it all at once into an appropriately sized buffer? Or just `mmap()` it? – Shawn Dec 05 '20 at 19:42

2 Answers

1

Use malloc. Find the size first (see "How do you determine the size of a file in C?"), then malloc the appropriate number of bytes and do the read.
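
To make that concrete, here is a minimal sketch of the "size first, then one malloc" approach, assuming a POSIX system and a regular file. The helper name `read_whole_file` and its `out_len` parameter are made up for illustration. It uses `fstat` on the already-open descriptor, and it still loops on `read` because a single call is allowed to return fewer bytes than requested.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the question): reads a regular file into
 * one malloc'd buffer sized from fstat(). Returns NULL on error. The
 * number of bytes read is stored in *out_len; the caller must free(). */
char *read_whole_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
        close(fd);               /* size is only meaningful for regular files */
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    char *contents = malloc(size + 1);   /* +1 for a convenience terminator */
    if (contents == NULL) {
        close(fd);
        return NULL;
    }

    size_t total = 0;
    while (total < size) {
        ssize_t n = read(fd, contents + total, size - total);
        if (n < 0) {             /* read error */
            free(contents);
            close(fd);
            return NULL;
        }
        if (n == 0)              /* file is shorter than fstat() reported */
            break;
        total += (size_t)n;
    }
    close(fd);

    contents[total] = '\0';      /* lets text files be used as C strings */
    *out_len = total;
    return contents;
}
```

A call would look like `char *data = read_whole_file(file, &len);` followed by `free(data)` when done. This does not work for pipes or terminals, where the size is not known up front.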

Vercingatorix
  • 1
    or if the file is very large, map it into the address space (`mmap`, or `CreateFileMapping`/`MapViewOfFile` on Windows). – 0___________ Dec 05 '20 at 19:55
  • @eewanco Thanks! I hadn't remembered to use `stat`, that's a handy solution. I'm still learning C and system calls so I will check `mmap` in the (near) future – Lightsong Dec 05 '20 at 19:59
1

This is terrible code:

  • contentsOfFile is an uninitialized pointer, so dereferencing it invokes undefined behavior (UB)
  • read returns raw bytes and never adds a terminating null (it is unformatted I/O), but strcat expects null-terminated strings.
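
Both problems go away if the destination is allocated explicitly and each chunk is appended with `memcpy`, using the byte count that `read` returns instead of relying on a terminator. The following is only a sketch of that idea: the helper name `slurp_file`, the `out_len` parameter, and the value of `BUFSIZE` are my assumptions, since the question does not show them.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE 4096   /* assumed chunk size; the question does not give one */

/* Hypothetical helper: appends every chunk read() delivers to a growing
 * heap buffer. Returns NULL on error. The byte count is stored in
 * *out_len; the caller must free() the result. */
char *slurp_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    char buf[BUFSIZE];
    char *contents = NULL;       /* realloc(NULL, n) behaves like malloc(n) */
    size_t len = 0;
    ssize_t nbytes;

    while ((nbytes = read(fd, buf, sizeof buf)) > 0) {
        /* +1 keeps room for the terminator written after the loop */
        char *tmp = realloc(contents, len + (size_t)nbytes + 1);
        if (tmp == NULL) {
            free(contents);
            close(fd);
            return NULL;
        }
        contents = tmp;
        memcpy(contents + len, buf, (size_t)nbytes);  /* exactly nbytes, no strcat */
        len += (size_t)nbytes;
    }
    close(fd);

    if (nbytes < 0) {            /* the loop ended because read() failed */
        free(contents);
        return NULL;
    }
    if (contents == NULL) {      /* empty file: return a valid empty string */
        contents = malloc(1);
        if (contents == NULL)
            return NULL;
    }
    contents[len] = '\0';
    *out_len = len;
    return contents;
}
```

Growing the buffer on every chunk keeps the sketch short; doubling the capacity instead would cut down the number of `realloc` calls for large files.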

Without more context, it is hard to say which approach is the right one. Possible approaches are:

  • use mmap to map the file content into memory. After that, you can process it transparently, and the OS will load and unload pages from the file as required
  • load everything into memory using malloc and realloc, growing the buffer so there is always enough room for the next read
  • load everything into memory using a single malloc and a single read after finding the file size.
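
For the first option, a rough sketch on a POSIX system could look like the code below. The helper name `map_file` and the `out_len` parameter are hypothetical, and the mapping is read-only and private, which is just one reasonable choice.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps a file read-only into the address space.
 * Returns NULL on error or if the file is empty. The mapped length is
 * stored in *out_len; release with munmap((void *)ptr, *out_len). */
const char *map_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {  /* mmap rejects a zero length */
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    *out_len = (size_t)st.st_size;
    return p;
}
```

Keep in mind that the mapped bytes are not null-terminated, so string functions such as `strlen` or `strcat` must not be used on them directly; work with the pointer and the length instead.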
Serge Ballesta