EDIT: It looks like the fread function reads more characters than the record_size parameter specifies.

I've got 2 functions that bubble-sort a file by records (the key is the first character of each record). The first one uses system functions (read, write, etc.) and the second uses library functions (fread, fwrite, etc.). For a small record_size parameter both work well, but for e.g. record_size = 5000 only sys_sort works properly. The file sorted by lib_sort has fewer lines, and the lines have different lengths. Why? I don't know what the problem is.

void lib_sort(const char *filename, long long int record_size, long long int num_of_lines) {
    record_size++;  // '\n' char at the end of line
    FILE *file;

    if (!(file = fopen(filename, "r+"))) {
        printf("Cannot open %s file.\n", filename);
        fclose(file);
        exit(EXIT_FAILURE);
    }

    char *buffer1 = malloc(sizeof(char) * record_size);
    char *buffer2 = malloc(sizeof(char) * record_size);

    bool flag = true;

    while (flag) {
        flag = false;
        if(fseek(file, 0, SEEK_SET) != 0) {
            printf("fseek failed.\n");
        }
        if((fread(buffer1, sizeof(char), (size_t) record_size, file)) != record_size) {
            printf("fread failed.\n");
        }

        for (int i = 1; i < num_of_lines; ++i) {
            if((fread(buffer2, sizeof(char), (size_t) record_size, file)) != record_size) {
                printf("fread failed.\n");
            }
            if (buffer1[0] > buffer2[0]) {
                if(fseek(file, record_size * (-2), SEEK_CUR) != 0) {
                    printf("fseek failed.\n");
                }
                if((fwrite(buffer2, sizeof(char), (size_t) record_size, file)) != record_size) {
                    printf("fwrite failed.\n");
                }

                if((fwrite(buffer1, sizeof(char), (size_t) record_size, file)) != record_size) {
                    printf("write failed.\n");
                }
                flag = true;
            } else {
                char *tmp = buffer2;
                buffer2 = buffer1;
                buffer1 = tmp;
            }
        }
        num_of_lines--;
    }
    fclose(file);
    free(buffer1);
    free(buffer2);
}

And this is the correct one:

void sys_sort(const char *filename, long long int record_size, long long int num_of_records) {
    record_size++;  // '\n' char at the end of line
    int file;

    if ((file = open(filename, O_RDWR)) < 0) {
        printf("Cannot open %s file.\n", filename);
        close(file);
        exit(EXIT_FAILURE);
    }

    char *buffer1 = malloc(sizeof(char) * record_size);
    char *buffer2 = malloc(sizeof(char) * record_size);

    bool flag = true;

    while (flag) {
        flag = false;
        lseek(file, 0, SEEK_SET);
        read(file, buffer1, (size_t) record_size);

        for (int i = 1; i < num_of_records; ++i) {
            read(file, buffer2, (size_t) record_size);
            if (buffer1[0] > buffer2[0]) {
                lseek(file, record_size * (-2), SEEK_CUR);
                write(file, buffer2, (size_t) record_size);
                write(file, buffer1, (size_t) record_size);
                flag = true;
            } else {
                char *tmp = buffer2;
                buffer2 = buffer1;
                buffer1 = tmp;
            }
        }
        num_of_records--;
    }
    close(file);
    free(buffer1);
    free(buffer2);
}

I use Ubuntu 16.04 and standard C99.

bednius
  • First thing I'd suggest is to check whether any of those system calls are failing and why. Something like `if( fread(...) != record_size ) { perror("fread failed to read sufficient data") }`. Same for `fseek`, and `fwrite`. – Schwern Mar 22 '17 at 18:40
  • Sorry, I can't reproduce your problem. I made a file with 5 lines each starting with a single number and then 4999 x's. Both worked fine. – Schwern Mar 22 '17 at 19:02
  • I've just run the tests and no errors are printed. lib_sort for record_size = 5000 and num_of_records = 10 still doesn't work. – bednius Mar 22 '17 at 19:12
  • Edit your code with those tests, please. Can you give some more detail about what "doesn't work"? Also show how you're calling those functions. – Schwern Mar 22 '17 at 20:07
  • The result file just has fewer lines, and the lines have different lengths (some more and some less than 5000). I found that fread reads more characters than my record_size parameter, and I get something like this at the end of buffer1: jfuhaypwjgsgjxl\021\020 – bednius Mar 22 '17 at 20:38
  • You are using `fread()` and `fwrite()` incorrectly. When you call `fread(buffer, size, number of members, file stream pointer)` you are telling it that size of your record is `sizeof(char)` or `1` and that you are expecting it to read as many records as `record_size`. To `fread()` reading a few records, say 5000 out of requested 5010 bytes, is a success and will return 5000 that means it read 5000 records. In reality, what you wanted is to read only 1 record of `record_size` long. If you change to `fread(buffer2, (size_t) record_size, 1, file)` and do the same with `fwrite()`, it will work. – alvits Mar 23 '17 at 01:16

2 Answers

You are using fread() and fwrite() incorrectly.

size_t fread(void *ptr, size_t size, size_t nmemb, FILE * stream );

size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);

Description

The function fread() reads nmemb elements of data, each size bytes long, from the stream pointed to by stream, storing them at the location given by ptr.

The function fwrite() writes nmemb elements of data, each size bytes long, to the stream pointed to by stream, obtaining them from the location given by ptr.

You are telling fread() and fwrite() that each record (item) is 1 byte long and that you want 5000 of them.

Return Value

On success, fread() and fwrite() return the number of items read or written. This number equals the number of bytes transferred only when size is 1. If an error occurs, or the end of the file is reached, the return value is a short item count (or zero).

The parameters you pass to fread() and fwrite() are in the wrong order: your code clearly means the record size (the length of one item) to be record_size, which is 5000 in your failing scenario.

You should have written your code to call fread() in this manner instead:

fread(buffer1, (size_t) record_size, 1, file)

and call fwrite() in this manner:

fwrite(buffer2, (size_t) record_size, 1, file)
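
To illustrate the return-value rule quoted above, here is a minimal sketch. The helper name fread_count and the use of tmpfile() are my own choices for the demonstration, not something taken from the question's code. It writes a few bytes to a temporary file and reports what fread() returns for a given size/nmemb pair.

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): writes `len` bytes of `data`
 * to a temporary file, rewinds, and returns what
 * fread(buf, size, nmemb, f) reports. Returns 0 if the setup fails. */
size_t fread_count(const char *data, size_t len, size_t size, size_t nmemb) {
    char buf[64];
    if (size * nmemb > sizeof buf) {
        return 0;               /* keep the sketch inside the local buffer */
    }

    FILE *f = tmpfile();        /* opened in binary update mode ("wb+") */
    if (!f) {
        return 0;
    }

    size_t got = 0;
    if (fwrite(data, 1, len, f) == len) {
        rewind(f);              /* positioning call between write and read */
        got = fread(buf, size, nmemb, f);
    }

    fclose(f);
    return got;
}
```

With 10 bytes in the file, fread_count("0123456789", 10, 1, 16) returns 10 (ten one-byte items), fread_count("0123456789", 10, 16, 1) returns 0 (no complete 16-byte item is available), and fread_count("0123456789", 10, 10, 1) returns 1 (one complete record).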

It should also be noted that fread() and fwrite() work on binary streams. This means that strings are not automatically null-terminated, and reading will read past newlines and past null (\0) bytes. On writes, newlines are not automatically converted to the OS-specific newline, such as LF on Linux and CRLF on Windows.

For string operations, use fgets() and fputs() instead.

You did not mention your platform, and I wrongly assumed that you are running on Linux. Based on your comment, you are running on Windows after all. On Windows, fread() and fwrite() will not work correctly when the file is opened in text mode, because of OS-dependent translations. You will need to open the file in binary mode.

alvits
  • Actually, I tried this before your answer ([link](http://stackoverflow.com/questions/8589425/how-does-fread-really-work)) and it still works incorrectly. I think the problem is in the fread and fwrite functions. I made a file and then just called fread and fwrite. I compared the loaded string with the line in the file and they were the same. Then, after the fwrite call (writing the first record over the second), they were different! (Usually about the first 4000 characters are the same and the rest differ.) fwrite fails and always returns record_size (or 1 with your way of calling it). – bednius Mar 23 '17 at 02:33
  • You're interpreting the return value incorrectly. `fwrite()` will return the number of bytes written in your code and will return 1 in my suggestion. Both of those mean the write succeeded, but you are incorrectly assuming they failed. – alvits Mar 23 '17 at 02:41
  • Ah you are running Windowze after all. Then this is your issue http://stackoverflow.com/questions/3187693/fread-ftell-apparently-broken-under-windows-works-fine-under-linux – alvits Mar 23 '17 at 02:42
  • @bednius - on windows you need to open the file in `binary` when using `fread()` and `fwrite()`. – alvits Mar 23 '17 at 02:43
  • I understood you, and as I said, I had tried this version before and it wasn't working. The program doesn't print any errors (which means fread and fwrite report success, but they didn't really succeed, because the file content is different than expected). – bednius Mar 23 '17 at 02:46
  • It didn't work because you opened your file in text mode. You need to change `fopen(filename, "r+")` to `fopen(filename, "rb+")`. – alvits Mar 23 '17 at 02:48
  • I use ubuntu. Just wanted to try what happens on Windows – bednius Mar 23 '17 at 02:48
  • It's quite possible you are running out of stack and heap. Check if this will resolve the issue you are seeing http://stackoverflow.com/questions/13944841/how-to-increase-the-maximum-memory-allocated-on-the-stack-heap – alvits Mar 23 '17 at 23:50
  • I've got the solution. I put `fseek(file, 0, SEEK_CUR);` after every `fread()` and `fwrite()` call. I have no idea why it works. – bednius Mar 24 '17 at 18:56
  • That is indeed odd. You would only need to `fseek()` between reads and writes if you are operating on a single file to synchronize the file pointers. In your case you are reading from 1 file and writing onto another file. – alvits Mar 24 '17 at 19:05
  • @bednius Your issue would seem to be similar to [this](http://stackoverflow.com/questions/26227564/file-i-o-cant-read-write-simultaneously/26227809#26227809). – alvits Mar 24 '17 at 19:11
  • @bednius - I suggest you post your own answer for future readers to see. And I also realized you are operating on a single `file`. I automatically assumed you were working on `file1` and `file2` as soon as I saw `buffer1` and `buffer2`. – alvits Mar 24 '17 at 19:17
  • Should I edit my first post or add new answer? And thanks for any help by the way :) – bednius Mar 24 '17 at 21:53
  • @bednius - do not edit your original question. Post the solution that worked, as an answer. – alvits Mar 24 '17 at 21:55
I put fseek(file, 0, SEEK_CUR); after every fread() and fwrite() call, and that worked for me. I don't know why.
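
A likely explanation, as far as I can tell: for a stream opened for update (such as "r+"), the C standard requires a call to fflush() or a file positioning function (fseek(), fsetpos(), rewind()) between a write and a following read, and a positioning call between a read and a following write unless the read hit end-of-file. In lib_sort, the fread() at the top of the next loop iteration directly follows the two fwrite() calls, which breaks that rule. Below is a minimal sketch of a single swap step that respects it. The helper name swap_adjacent_records and its parameters are hypothetical, and it assumes record_size > 0 and a stream opened with "r+" or "r+b".

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: swaps the two adjacent records starting at record
 * index `idx` in a stream opened for update.
 * Returns 0 on success, -1 on any I/O or allocation failure. */
int swap_adjacent_records(FILE *f, long record_size, long idx) {
    int rc = -1;
    char *a = malloc((size_t) record_size);
    char *b = malloc((size_t) record_size);
    if (!a || !b) goto done;

    /* Position on the first record; this also separates any earlier
     * write on the stream from the reads below. */
    if (fseek(f, idx * record_size, SEEK_SET) != 0) goto done;
    if (fread(a, (size_t) record_size, 1, f) != 1) goto done;
    if (fread(b, (size_t) record_size, 1, f) != 1) goto done;

    /* Read -> write: the fseek back is the required positioning call. */
    if (fseek(f, -2 * record_size, SEEK_CUR) != 0) goto done;
    if (fwrite(b, (size_t) record_size, 1, f) != 1) goto done;
    if (fwrite(a, (size_t) record_size, 1, f) != 1) goto done;

    /* Write -> read: a no-op seek so the caller may safely read next. */
    if (fseek(f, 0, SEEK_CUR) != 0) goto done;

    rc = 0;
done:
    free(a);    /* free(NULL) is a no-op, so this is safe on every path */
    free(b);
    return rc;
}
```

In the question's loop, the missing piece corresponds to the last fseek(f, 0, SEEK_CUR): without it, the next fread() comes straight after an fwrite() on the same stream. An fflush(f) at that point should serve the same purpose.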

bednius
  • This is the explanation to what you were seeing http://stackoverflow.com/questions/1713819/why-fseek-or-fflush-is-always-required-between-reading-and-writing-in-the-read-w – alvits Mar 24 '17 at 22:52