1

I'm trying to read in a .txt file (paragraph.txt as input) in C, and when I print the string it has garbage values on the end such as the paths to my visual studio, or junk data. This occurs 99% of the time with some exceptions. I have no idea why it is doing this?

void readfile(char **buffer, char *input)
{
    FILE *file = fopen(input, "r");
    int bytes = filesize(file);
    *buffer = (char*)malloc(bytes);
    fread(*buffer, bytes, 1, file);
    printf("%s\n", *buffer);
    fclose(file);
}

My filesize function just returns the number of bytes a file is, I've checked this and it is correct (4412 bytes is returned which is the exact number of characters I have). Function is called like:

readfile(&buffer, input);
John3136
  • 28,809
  • 4
  • 51
  • 69

1 Answers1

0

It's displaying "junk" because of two errors:

  1. you don't allocate enough memory for the content plus the '\0'-terminating byte.
  2. you don't set the '\0'-terminating byte. fread won't do that because fread doesn't care about the meaning of the bytes it's reading, it just reads a block of memory and that's it.

int readfile(char **buffer, const char *input)
{
    if(buffer == NULL)
        return 0;

    FILE *file = fopen(input, "r");
    if(file == NULL)
    {
        fprintf(stderr, "could not open %s for reading: %s\n", input,
                strerror(errno));
        return 0;
    }

    int bytes = filesize(file);

    *buffer = malloc(bytes + 1);
    if(*buffer == NULL)
    {
        fprintf(stderr, "Not enough memory\n");
        fclose(file);
        return 0;
    }

    if(fread(*buffer, bytes, 1, file) != 1)
    {
        fprintf(stderr, "could not read\n");
        free(*buffer);
        *buffer = NULL;
        fclose(file);
        return 0;
    }

    (*buffer)[bytes] = 0; // setting the 0 byte
    puts(*buffer);
    fclose(file);

    return 1;
}

You should also always check the return code of functions, specially those that return pointers and/or write to a pointer you passed them. If one of the functions returns an error, you would have undefined behaviour otherwise.

edit

in the comments you said

I tried this by doing *buffer[bytes-1] = '\0'; but it just crashed the program when it tried to do this

that's because buffer[bytes-1] is the same as buffer + bytes -1 and this beyond the bound of where buffer is pointing. When you try to dereference it with *, it will crash. You you need to do (like I did in the code) is: (*buffer)[bytes] = 0; which is the same as buffer[0][bytes] = 0.

Pablo
  • 13,271
  • 4
  • 39
  • 59
  • Unusually enough, it works, HOWEVER... when I do it a second time in a row the junk data comes back. – wrestlerdude Feb 20 '18 at 02:28
  • So the problem still persists. – wrestlerdude Feb 20 '18 at 02:30
  • Yes I have and it seems to add "Visual Studio 14.0\Com" to the end every time its ran after the initial time. – wrestlerdude Feb 20 '18 at 02:37
  • Seems like I'm out of luck then, thanks for trying anyway. – wrestlerdude Feb 20 '18 at 02:41
  • I called `readfile(&buffer, "/etc/fstab");` 10 times in a row with my code, I don't get the garbage at the end. You might have another undefined behaviour somewhere else, perhaps in `filesize`. – Pablo Feb 20 '18 at 02:42
  • Note that in windows `fopen(file, "r")` and `fopen(file, "rb")` behave differently, because windows has `\r\n` for newlines, so depending on how you calculate the number of bytes, you would get a wrong value, which may lead to an error in `fread`. Check your `filesize` function. Are you trying to read text files or binary files? – Pablo Feb 20 '18 at 02:45
  • This is strange behaviour. I recommend running your code with a debugger and see where it goes wrong. – Pablo Feb 20 '18 at 02:46
  • I am trying to read textfiles. To calculate the number of bytes I use fseek to go to the end of the file and then use ftell to get the number of bytes. I use rewind to go back to the start of the file. – wrestlerdude Feb 20 '18 at 02:50
  • @wrestlerdude Sadly I don't have windows, so I cannot check it there, in linux it works without a problem. The second time you execute `readfile` when you get garbage at the end, do you execute it with the same filename or with another filename? What happens when you execute twice with the same file name? Does you `filesize` do a `rewind`? See also https://stackoverflow.com/questions/238603/how-can-i-get-a-files-size-in-c – Pablo Feb 20 '18 at 03:17
  • I have found the root of my issue. For some reason my method of calculating the size of the file is incorrect because it gives a bigger file size than it should. I have used a different aglorithm and it now works. Although the null byte and the return checking is really useful and helped me reach a solution. – wrestlerdude Feb 20 '18 at 03:40