0

I'm stuck with an apparently harmless piece of code. I'm trying to read a whole flv video file into a uint8_t array, but by no reason only the 10 first bytes are read.

contents = malloc(size + 1);

if (read(fd, contents, size) < 0)
{
    free(contents);
    log_message(WARNING, __func__, EMSG_READFILE);
    return (NULL);
}

I've tried with fopen and "rb" also, but seems that Glibc ignores that extra 'b' or something. Any clues?

Thanks in advance.

Edit: Maybe it reads a EOF character?

PS. 'size' is a variable containing the actual file size using stat().

Manuel Abeledo
  • 327
  • 2
  • 4
  • 14

3 Answers3

3

It seems the original code correctly reads the entire content.

The problem seems to be in making use of that binary data - printing it out will truncate at the first null, making it appear that only 10 bytes are present. You can't use any methods intended for strings or character arrays to output binary data, as they will truncate at the first null byte, making it appear the array is shorter than it really is.

Check out some other questions related to viewing hex data: how do I print an unsigned char as hex in c++ using ostream? Converting binary data to printable hex

If you want to append this to a string - in what format? hex? base64? Raw bytes won't work.

Here's the original code I posted. A few minor improvements, plus some better diagnostic code:

int ret, size = 4096; /* Probably needs to be much bigger */
uint8_t *contents;
contents = malloc(size + 1);

if(contents == NULL)
{
    log_message(WARNING, __func__, EMSG_MEMORY);
    return (NULL);
}

ret = read(fd, contents, size);
if(ret < 0)
{
    /* Error reading file */
    free(contents);
    log_message(WARNING, __func__, EMSG_READFILE);
    return (NULL);
}

for(i = 0;i < ret;++i)
{
    printf("%c", contents[i]);
    /* printf("%0.2X", (char) contents[i]); /* Alternatively, print in hex */
}

Now, is ret really 10? Or do you just get 10 bytes when you try to print the output?

Community
  • 1
  • 1
jmanning2k
  • 9,297
  • 4
  • 31
  • 23
2

The 'read()' function in the C library doesn't necessarily return the whole read in one shot. In fact, if you're reading very much data at all, it usually doesn't give it to you in a single call.

The solution to this is to call read() in a loop, continuing to ask for more data until you've got it all, or until read returns an error, indicated by a negative return value, or end-of-file, indicated by a zero return value.

Something like the following (untested):

contents = malloc(size + 1);

bytesread = 0;
pos = 0;
while (pos < size && (bytesread = read(fd, contents + pos, size - pos)) > 0)
{
    pos += bytesread;
}

if (bytesread < 0)
{
    free(contents);
    log_message(WARNING, __func__, EMSG_READFILE);
    return (NULL);
}

/* Go on to use 'contents' now, since it's been filled.  Should probably 
   check that pos == size to make sure the file was the size you expected. */

Note that most C programmers would do this a little differently, probably making 'pos' a pointer which gets moved along, rather than offsetting from 'contents' each time through the loop. But I thought this approach might be clearer.

divegeek
  • 4,795
  • 2
  • 23
  • 28
  • +1 for calling 'read' in a loop. should check if malloc fails due. – Liran Orevi Oct 06 '09 at 22:40
  • Yes, the code should check if malloc fails. In practice, malloc virtually (hehe) never fails on Linux or other modern OSes, and if it does the program is just going to segfault as soon as the content pointer is dereferenced. Odds are if malloc does fail, your only option is to abort the program anyway. But, for correctness, the return value of malloc should be checked. – divegeek Oct 07 '09 at 01:00
0

On success, read() returns the number of bytes read (which may be less than what you asked for, at which point you should ask for the rest.) On EOF it will return 0 and on error it will return -1. There are some errors for which you might want to consider re-issuing the read (eg. EINTR which happens when you get a signal during a read.)

asveikau
  • 39,039
  • 2
  • 53
  • 68