4

So im trying to read an entire text file using this function:

FILE *fp = fopen(path, "r");
fseek(fp, 0, SEEK_END);
int tamanioArchivo = sizeof(char) * ftell(fp);
fseek(fp, 0, SEEK_SET);
char* archivo = malloc(tamanioArchivo + 1);
fread(archivo, tamanioArchivo + 1, 1, fp);
//do something with archivo
fclose(fp);
free(archivo);

I debugged it and the problem seems to be on the fread line. It brings back the file and adds some garbage at the end. Any ideas what im doing wrong?

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Marco
  • 1,112
  • 1
  • 14
  • 34

3 Answers3

5

Generally C doesn't care what the contents of a file are. Whether it's text or binary data, it's read the same way. Meaning if you want to read a string and get something nicely null-terminated, you need to handle that yourself.

fread(archivo, tamanioArchivo+1, 1, fp);

This reads one extra byte (again, null-termination is a C thing, the file system does not enforce this). Get rid of the plus 1. Then you must ensure that it's null-terminated:

archivo[tamanioArchivo] = '\0';
Max
  • 21,123
  • 5
  • 49
  • 71
  • 3
    Note that you must still _allocate_ one extra byte; that is, the `malloc` call is correct in the original code. – zwol Sep 15 '17 at 13:37
  • 1
    And how do you know if the `fread()` call worked? You're not checking the return value. – Andrew Henle Sep 15 '17 at 15:12
  • Likewise for `fopen` and `malloc`, but I was just addressing the original question. – Max Sep 15 '17 at 15:21
  • Im using the eclipse debugger that gives me on every instruction the value of return. @AndrewHenle – Marco Sep 15 '17 at 16:21
  • @Marco Good to hear. Did my answer solve your problem? – Max Sep 15 '17 at 16:37
  • @Max Yes! but i ended up implementing an mmap. – Marco Sep 16 '17 at 22:53
  • @Marco, It's nice that your debugger gives you the return values, but you should be checking those values in your code, not just looking at them in the debugger. – Makyen Sep 17 '17 at 00:29
2

The "garbage" comes from improperly examining/printing archivo and code not noting how much was read.

fread() saves into archivo data, but that data is not certainly null character terminated - a C string. Append a '\0' if archivo is to point to a string.

But where? After the last character read - which is determined by the return value of fread(), not after the number of characters requested.

Even without an error, the return value may be less than the number of requested characters due to text mode end-of-line translation of "\r\n" into "\n". The value returned from ftell() is not necessarily the numbers of characters that will be read in text mode.

No need to attempt to read tamanioArchivo+1 characters. Use tamanioArchivo.

Also see @ Andrew Henle good comment about fread(ptr, element_size, number_of_elements, stream) parameter order.

// Note that the file is opened in text mode, not binary mode
FILE *fp = fopen (path, "r");

fseek(fp, 0, SEEK_END);
// int tamanioArchivo = sizeof(char) * ftell(fp);
long tamanioArchivo = ftell(fp);
fseek(fp, 0, SEEK_SET); // or `rewind(fp);

// Good to allocate +1 for the null chracter
char* archivo = malloc(tamanioArchivo + 1u);

// Save the return value and drop the +1
// fread(archivo, tamanioArchivo+1, 1, fp);
size_t read_count = fread(archivo, 1, tamanioArchivo, fp);

// bad
// bytes read may be less than the offset due to CR/LF-->LF or input error
archivo[tamanioArchivo] = '\0';

// good
archivo[read_count] = '\0';

//do something with archivo
puts(archivo); 

fclose (fp);
free(archivo);

Robust code would check the return value of fopen(), malloc(), fseek(), ftell() for unusual return values. The seek to the end, ftell() method for determining file length has additional limitations. Alternatives depend on various unposted coding goals.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    `size_t read_count = fread(archivo, tamanioArchivo, 1, fp);`? Isn't that backwards? See http://pubs.opengroup.org/onlinepubs/9699919799/functions/fread.html You're asking `fread()` to read *one* item of size `tamanioArchivo` bytes, and since `fread()` returns the number of *items* successfully read, you'd get a return value of `1` on success. I'd think you want `size_t read_count = fread(archivo, 1, tamanioArchivo, fp);` for the return value to be a count of the number of bytes actually read. – Andrew Henle Sep 15 '17 at 17:27
  • @AndrewHenle Yes. Code amended. – chux - Reinstate Monica Sep 15 '17 at 17:36
1

Try this example:

/* Open Fpga config file */
fpconf = fopen(file, "rb");
if (fpconf == NULL) {
     return ERROR;
}

/* set SEEK_END to read the file size using ftell */
fseek(fpconf, 0L, SEEK_END);

bsize = ftell(fpconf);
print_dbg("Size of file: %d bytes\n", (int)bsize);

/* set file offset 0 */
fseek(fpconf, 0L, SEEK_SET);

data = (uint8_t *) malloc(sizeof(uint8_t) * bsize);
if (data == NULL) {
    print_err("Error in malloc\n");
    return ERROR;
}

/* Read data from file and store into buffer */
fread(data, bsize, 1, fpconf);

/* Close file */
fclose(fpconf);
Rajeshkumar
  • 739
  • 5
  • 16
  • 1
    `sizeof(uint8_t)` is 1 _by definition_, just like `sizeof(char)`; you should never write it. – zwol Sep 15 '17 at 13:09
  • yes. It is not required. – Rajeshkumar Sep 15 '17 at 13:15
  • The problem is the missing null terminator at the end of the file contents read into memory. You must allocated one extra byte and set the null terminator after the bytes read from the file. – chqrlie Sep 15 '17 at 15:19