0

Trying to create 1Mb(1048576Byte) file by writing in various chunk sizes and a different number of threads. When int NUM_THREADS = 2 or int NUM_THREADS = 1 then created file size is same as given i.e. 10MB .

However when I increase thread count to 4, The created file size is around 400MB; Why this anomaly?

#include <pthread.h>
#include <string>
#include <iostream>
#define TenGBtoByte 1048576
#define fileToWrite "/tmp/schatterjee.txt"

using namespace std;
pthread_mutex_t mutexsum;
struct workDetails {
    int threadcount;
    int chunkSize;
    char *data;
};

void *SPWork(void *threadarg) {
    struct workDetails *thisWork;
    thisWork = (struct workDetails *) threadarg;
    int threadcount = thisWork->threadcount;
    int chunkSize = thisWork->chunkSize;
    char *data = thisWork->data;
    long noOfWrites = (TenGBtoByte / (threadcount * chunkSize));
    FILE *f = fopen(fileToWrite, "a+");
    for (long i = 0; i < noOfWrites; ++i) {
        pthread_mutex_lock(&mutexsum);
        fprintf(f, "%s", data);
        fflush (f);
        pthread_mutex_unlock(&mutexsum);
    }
    fclose(f);
    pthread_exit((void *) NULL);
}

int main(int argc, char *argv[]) {
    int blocksize[] = {1024};
    int NUM_THREADS = 2;
    for (int BLOCKSIZE: blocksize) {
        char *data = new char[BLOCKSIZE];
        fill_n(data, BLOCKSIZE, 'x');

        pthread_t thread[NUM_THREADS];
        workDetails detail[NUM_THREADS];
        pthread_attr_t attr;
        int rc;
        long threadNo;
        void *status;

        /* Initialize and set thread detached attribute */
        pthread_mutex_init(&mutexsum, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            detail[threadNo].threadcount = NUM_THREADS;
            detail[threadNo].chunkSize = BLOCKSIZE;
            detail[threadNo].data = data;
            rc = pthread_create(&thread[threadNo], &attr, SPWork, (void *) &detail[threadNo]);
            if (rc) exit(-1);
        }
        pthread_attr_destroy(&attr);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            rc = pthread_join(thread[threadNo], &status);
            if (rc) exit(-1);
        }
        pthread_mutex_destroy(&mutexsum);
        delete[] data;
    }
    pthread_exit(NULL);
}

N.B. - 1)It's a benchmarking task, so doing as they asked in requirement. 2) long noOfWrites = (TenGBtoByte / (threadcount * chunkSize)); basically computing how many times each thread should write to get the combined size of 10MB. 4)I tried to put Mutex lock at various position . All yeild in same result

Suggestions about other changes in the programme is also welcome

Grant Miller
  • 27,532
  • 16
  • 147
  • 165
sapy
  • 8,952
  • 7
  • 49
  • 60
  • Don't tag C++ questions with the C tag — you'll make people unhappy with you. If you're using C++ headers, use a C++ tag (and not a C tag). – Jonathan Leffler Mar 13 '18 at 04:46
  • 2
    Because the mutex is locked for pretty much the entire thread you wind up serializing the bulk of the work. – user4581301 Mar 13 '18 at 04:46
  • I tried different position for mutex locks . – sapy Mar 13 '18 at 04:48
  • 1
    Look at the specification of [`flockfile()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/flockfile.html) (and `funlockfile()` on the same page). Read that specification carefully; it impacts every other function that uses an explicit or implicit `FILE *` value. Rethink your use of the mutex because each thread keeps the mutex locked while it is running. That might be necessary, but it makes using threads pointless. You could do the same job without using `pthread_create()`. (Also, except for the headers, you might as well be coding in C — but you should still decide which!) – Jonathan Leffler Mar 13 '18 at 04:50
  • Doesn't work ..Which is preety much what I expected . If mutex itself can't tame the loop. File lock wont . Problem is somewhere deep .. ```pthread_mutex_lock(&mutexsum); flockfile(f); fprintf(f, "%s", data); fflush (f); funlockfile(f); pthread_mutex_unlock(&mutexsum);``` – sapy Mar 13 '18 at 05:02
  • One question per Question. – Jive Dadson Mar 13 '18 at 05:13
  • @JiveDadson Done. – sapy Mar 13 '18 at 05:15
  • Quoting the fopen man page: "Opening a file with append mode (a as the first character in the mode argument) shall cause all subsequent writes to the file to be forced to the then current end-of-file, regardless of intervening calls to fseek()." With this information, think on where each thread going to write data If you have multiple threads all with the same file open for appending. – user4581301 Mar 13 '18 at 05:16
  • Then actual file size should be less than 10MB right how come it's 4 times ? And how come when thread count is 2 it's exact size ? This is a working program you can try in your machine and see. – sapy Mar 13 '18 at 05:17
  • To figure out why the file is bigger than it should be, I would start by counting and printing out the number of block writes executed. with `TenGBtoByte` of 1048576 and `BLOCKSIZE` of 1024, there should be a total of 1024 writes. If you get more, you're on your way. – user4581301 Mar 13 '18 at 05:24
  • Thats the first thing I tried when block size is 1024 & num of thread is 4 . Total count is (1 to 256 ) * 4 . i.e. 1024. But file size is 40 times :) . This looks like a deep problem with file io / unflushed buffer etc.. – sapy Mar 13 '18 at 05:34
  • Is there any info that will help you answer the question ? Please let me know . – sapy Mar 13 '18 at 05:57

2 Answers2

2

You are allocating and initializing your data array like this:

char *data = new char[BLOCKSIZE];
fill_n(data, BLOCKSIZE, 'x');

Then you are writing it to file using fprintf:

fprintf(f, "%s", data);

Function fprintf expects data to be a null-terminated string. This is an undefined behavior already. If this worked with low number of threads, it is because memory after than memory chunk happen to contain zero byte.

Other than that, mutex in your program serves no purpose and can be removed. File locking is also redundant, so you can use fwrite_unlocked and fflush_unlocked to write your data since every thread uses separate FILE object. Essentially all synchronization in your program is performed in the kernel, not in userspace.

Even after removing mutex and using _unlocked functions your program reliably creates 1 MB files regardless of number of threads. So invalid file writing seems to be the only issue you have.

  • Without mutex file size is less than original. So mutex is required. (Try removing mutex and you will see) – sapy Mar 13 '18 at 21:18
  • @sapy I tried and it worked. You are opening file in append mode, if it works correctly all threads should ignore current position in the file and append data to the end of the file. Unless append mode does not work for you - might be because filesystem does not support it. This is possible with NFS. Should work without mutex on Linux with EXT4. –  Mar 13 '18 at 21:34
  • @sapy You should keep a mutex there for portability reasons, my point is that you should not observe a race on a Linux because it ignores file pointer for files opened with `O_APPEND` (which is what libc usually uses when it takes `a` mode. According to [this](https://linux.die.net/man/2/pwrite) Linux is also non-conforming when it comes to `pwrite` - see "bugs" section. I am not sure, but it might be required that file writing for C files uses `write` and cannot use `pwrite` in which case according to POSIX it should always append data, not rewrite it. –  Mar 20 '18 at 09:04
0

@Ivan Yes! Yes! Yes! .You are absolutely right my friend. Except for a small fact. The mutex is necessary. This is the final code. Try removing mutex and file size will be different.

#include <pthread.h>
#include <string>
#include <iostream>
#define TenGBtoByte 1048576
#define fileToWrite "/tmp/schatterjee.txt"

using namespace std;
pthread_mutex_t mutexsum;;
struct workDetails {
    int threadcount;
    int chunkSize;
    char *data;
};

void *SPWork(void *threadarg) {

    struct workDetails *thisWork;
    thisWork = (struct workDetails *) threadarg;
    int threadcount = thisWork->threadcount;
    int chunkSize = thisWork->chunkSize;
    char *data = thisWork->data;
    long noOfWrites = (TenGBtoByte / (threadcount * chunkSize));
    FILE *f = fopen(fileToWrite, "a+");

    for (long i = 0; i < noOfWrites; ++i) {
        pthread_mutex_lock(&mutexsum);
        fprintf(f, "%s", data);
        fflush (f);
        pthread_mutex_unlock(&mutexsum);
    }
    fclose(f);
    pthread_exit((void *) NULL);
}

int main(int argc, char *argv[]) {
    int blocksize[] = {1024};
    int NUM_THREADS = 128;
    for (int BLOCKSIZE: blocksize) {
        char *data = new char[BLOCKSIZE+1];
        fill_n(data, BLOCKSIZE, 'x');
        data[BLOCKSIZE] = NULL;

        pthread_t thread[NUM_THREADS];
        workDetails detail[NUM_THREADS];
        pthread_attr_t attr;
        int rc;
        long threadNo;
        void *status;
        pthread_mutex_init(&mutexsum, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            detail[threadNo].threadcount = NUM_THREADS;
            detail[threadNo].chunkSize = BLOCKSIZE;
            detail[threadNo].data = data;
            rc = pthread_create(&thread[threadNo], &attr, SPWork, (void *) &detail[threadNo]);
            if (rc) exit(-1);
        }
        pthread_attr_destroy(&attr);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            rc = pthread_join(thread[threadNo], &status);
            if (rc) exit(-1);
        }
        pthread_mutex_destroy(&mutexsum);
        delete[] data;
    }
    pthread_exit(NULL);
} 
sapy
  • 8,952
  • 7
  • 49
  • 60
  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - [From Review](/review/low-quality-posts/19099567) – Gilles Gouaillardet Mar 14 '18 at 00:21
  • It does provide an answer as I have put the working code that fixes the issue. And I have referred to another answer without quoting to it. – sapy Mar 14 '18 at 02:11
  • why did you accept an incorrect answer then ? this code is your initial question plus two minor changes ... keep in mind a serial version will run faster than this multithreaded one, which is suboptimal on so many levels I do not know where to start. – Gilles Gouaillardet Mar 14 '18 at 02:41
  • The answer was not completely incorrect. The additional suggestion was incorrect. Disk serial operation is faster than multithread that well known fact. As I pointed out in question it's a benchmarking task by a company so I'm doing as we were asked. – sapy Mar 14 '18 at 02:46
  • @GillesGouaillardet those 2 lines of minor change fixes the issue :) – sapy Mar 14 '18 at 02:47
  • "well known fact" if you write to a single physical mechanical disk. the performance issues are overhead in create/join and the fact that only one thread ends up working at a given time, so there is no concurrency but only overhead. bottom line, you are not benchmarking anything except the overhead of your program. use `iozone` instead ... – Gilles Gouaillardet Mar 14 '18 at 02:50
  • I know about all available benchmarking tool. Do you know 2 threads can work faster than 1 thread depending on what you are trying to write(depending on what computation involved)? Let's be less opinionated and live a happy life. It's just the stripped down version of a 10000 line programme that run on grid . So I know what I'm doing :) – sapy Mar 14 '18 at 03:31
  • 1
    let's live a happy life then :-) – Gilles Gouaillardet Mar 14 '18 at 04:05