Continuing from my comments, you are thinking along the correct lines, you just haven't put the pieces together in the right order. In main()
you are looping calling your function, allocating for a single line, and then outputting a line of text and doing it over and over again. (and without freeing any of the memory you allocate -- creating a memory leak of every line read)
If you are allocating storage to hold the line, you will generally want to read the entire file in your function in a single pass though the file allocating for, and storing all lines, and returning a pointer to your collection of lines for printing in main()
(or whatever the calling function is).
You do that by adding one additional level of indirection and having your function return char **
. This is a simple two-step process where you:
allocate a block of memory containing pointers (one pointer for each line). Since you will not know how many lines beforehand, you simply allocate some number of pointers and then realloc()
more pointers when you run out;
for each line you read, you allocate length + 1
characters of storage and copy the current line to that block of memory assigning the address for the line to your next available pointer.
(you either keep track of the number of pointers and lines allocated, or provide an additional pointer set to NULL
as a Sentinel after the last pointer assigned a line -- up to you, simply keeping track with a counter is likely conceptually easier)
After reading your last line, you simply return the pointer to the collection of pointers which is assigned for use back in the caller. (you can also pass the address of a char **
as a parameter to your function, resulting in the type being char ***
, but being a Three-Star Programmer isn't always a compliment). However, there is nothing wrong with doing it that way, and in some cases, it will be required, but if you have an alternative, that is generally the preferred route.
So how would this work in practice?
Simply change your function return type to char **
and pass an additional pointer to a counting variable so you can update the value at that address with the number of lines read before you return from your function. E.g., you could do:
char **readfile (FILE *fp, size_t *n);
Which will take your file pointer to an open file stream and then read each line from it, allocating storage for the line and assigning the address for that allocation to one of your pointers. Within the function, you would use a sufficiently sized character array to hold each line you read with fgets()
. Trim the '\n'
from the end and get the length and then allocate length + 1
bytes to hold the line. Assign the address for the newly allocated block to a pointer and copy from your array to the newly allocated block.
(strdup()
can both allocate and copy, but it is not part of the standard library, it's POSIX -- though most compilers support it as an extension if you provide the proper options)
Below the readfile()
function puts it altogether, starting with a single pointer and reallocation twice the current number when you run out (that provides a reasonable trade-off between the number of allocations needed and the number of pointers. (after just 20 calls to realloc()
, you would have 1M pointers). You can choose any reallocation and growth scheme you like, but you want to avoid calling realloc()
for every line -- realloc()
is still a relatively expensive call.
#define MAXC 1024 /* if you need a constant, #define one (or more) */
#define NPTR 1 /* initial no. of pointers to allocate */
/* readfile reads all lines from fp, updating the value at the address
* provided by 'n'. On success returns pointer to allocated block of pointers
* with each of *n pointers holding the address of an allocated block of
* memory containing a line from the file. On allocation failure, the number
* of lines successfully read prior to failure is returned. Caller is
* responsible for freeing all memory when done with it.
*/
char **readfile (FILE *fp, size_t *n)
{
char buffer[MAXC], **lines; /* buffer to hold each line, pointer */
size_t allocated = NPTR, used = 0; /* allocated and used pointers */
lines = malloc (allocated * sizeof *lines); /* allocate initial pointer(s) */
if (lines == NULL) { /* validate EVERY allocation */
perror ("malloc-lines");
return NULL;
}
while (fgets (buffer, MAXC, fp)) { /* read each line from file */
size_t len; /* variable to hold line-length */
if (used == allocated) { /* is pointer reallocation needed */
/* always realloc to a temporary pointer to avoid memory leak if
* realloc fails returning NULL.
*/
void *tmp = realloc (lines, 2 * allocated * sizeof *lines);
if (!tmp) { /* validate EVERY reallocation */
perror ("realloc-lines");
break; /* lines before failure still good */
}
lines = tmp; /* assign reallocted block to lines */
allocated *= 2; /* update no. of allocated pointers */
}
buffer[(len = strcspn (buffer, "\n"))] = 0; /* trim \n, save length */
lines[used] = malloc (len + 1); /* allocate storage for line */
if (!lines[used]) { /* validate EVERY allocation */
perror ("malloc-lines[used]");
break;
}
memcpy (lines[used], buffer, len + 1); /* copy buffer to lines[used] */
used++; /* increment used no. of pointers */
}
*n = used; /* update value at address provided by n */
/* can do final realloc() here to resize exactly to used no. of pointers */
return lines; /* return pointer to allocated block of pointers */
}
In main()
, you simply pass your file pointer and the address of a size_t
variable and check the return before iterating through the pointers making whatever use of the line you need (they are simply printed below), e.g.
int main (int argc, char **argv) {
char **lines; /* pointer to allocated block of pointers and lines */
size_t n; /* number of lines read */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
lines = readfile (fp, &n);
if (fp != stdin) /* close file if not stdin */
fclose (fp);
if (!lines) { /* validate readfile() return */
fputs ("error: no lines read from file.\n", stderr);
return 1;
}
for (size_t i = 0; i < n; i++) { /* loop outputting all lines read */
puts (lines[i]);
free (lines[i]); /* don't forget to free lines */
}
free (lines); /* and free pointers */
return 0;
}
(note: don't forget to free the memory you allocated when you are done. That become critical when you are calling functions that allocate within other functions. In main()
, the memory will be automatically released on exit, but build good habits.)
Example Use/Output
$ ./bin/readfile_allocate dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
The program will read any file, no matter if it has 4-lines or 400,000 lines up to the physical limit of your system memory (adjust MAXC
if your lines are longer than 1023 characters).
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind
is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/readfile_allocate dat/captnjack.txt
==4801== Memcheck, a memory error detector
==4801== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4801== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4801== Command: ./bin/readfile_allocate dat/captnjack.txt
==4801==
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
==4801==
==4801== HEAP SUMMARY:
==4801== in use at exit: 0 bytes in 0 blocks
==4801== total heap usage: 10 allocs, 10 frees, 5,804 bytes allocated
==4801==
==4801== All heap blocks were freed -- no leaks are possible
==4801==
==4801== For counts of detected and suppressed errors, rerun with: -v
==4801== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
The full code used for the example simply includes the headers, but is included below for completeness:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
#define NPTR 1 /* initial no. of pointers to allocate */
/* readfile reads all lines from fp, updating the value at the address
* provided by 'n'. On success returns pointer to allocated block of pointers
* with each of *n pointers holding the address of an allocated block of
* memory containing a line from the file. On allocation failure, the number
* of lines successfully read prior to failure is returned. Caller is
* responsible for freeing all memory when done with it.
*/
char **readfile (FILE *fp, size_t *n)
{
char buffer[MAXC], **lines; /* buffer to hold each line, pointer */
size_t allocated = NPTR, used = 0; /* allocated and used pointers */
lines = malloc (allocated * sizeof *lines); /* allocate initial pointer(s) */
if (lines == NULL) { /* validate EVERY allocation */
perror ("malloc-lines");
return NULL;
}
while (fgets (buffer, MAXC, fp)) { /* read each line from file */
size_t len; /* variable to hold line-length */
if (used == allocated) { /* is pointer reallocation needed */
/* always realloc to a temporary pointer to avoid memory leak if
* realloc fails returning NULL.
*/
void *tmp = realloc (lines, 2 * allocated * sizeof *lines);
if (!tmp) { /* validate EVERY reallocation */
perror ("realloc-lines");
break; /* lines before failure still good */
}
lines = tmp; /* assign reallocted block to lines */
allocated *= 2; /* update no. of allocated pointers */
}
buffer[(len = strcspn (buffer, "\n"))] = 0; /* trim \n, save length */
lines[used] = malloc (len + 1); /* allocate storage for line */
if (!lines[used]) { /* validate EVERY allocation */
perror ("malloc-lines[used]");
break;
}
memcpy (lines[used], buffer, len + 1); /* copy buffer to lines[used] */
used++; /* increment used no. of pointers */
}
*n = used; /* update value at address provided by n */
/* can do final realloc() here to resize exactly to used no. of pointers */
return lines; /* return pointer to allocated block of pointers */
}
int main (int argc, char **argv) {
char **lines; /* pointer to allocated block of pointers and lines */
size_t n; /* number of lines read */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
lines = readfile (fp, &n);
if (fp != stdin) /* close file if not stdin */
fclose (fp);
if (!lines) { /* validate readfile() return */
fputs ("error: no lines read from file.\n", stderr);
return 1;
}
for (size_t i = 0; i < n; i++) { /* loop outputting all lines read */
puts (lines[i]);
free (lines[i]); /* don't forget to free lines */
}
free (lines); /* and free pointers */
return 0;
}
Let me know if you have further questions.