0

I created 50 threads to read the same file at the same time and then, in each thread, tried to write its content to new file that create with different name. The code was supposed to generate 50 different files. But I got unexpected results that it just generate 3~5 files. Results after running my code

When all the read the same file, there is no race-condition, and each thread is aimed to write its content to different file.

Can somebody help me? Thank you!

My code is listed below and it is a modification from Reference

#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <string.h>
#include <iostream>
#include <vector>
#include <thread>

void copy(const char *src_path, const char *dst_path);

int main(int argc, char **argv)
{
    std::vector<std::thread> vth;
    char *tmp = "Copy.deb";
    for (int i = 0; i < 50; ++i)
    {
        char src[40];
        memset(src, '\0', sizeof(src));
        sprintf(src, "%d", i);
        strcat(src, tmp);
        vth.emplace_back(std::bind(copy, "Original.deb", strdup(src)));
    }
    for (int i = 0; i < 50; ++i)
    {
        vth[i].join();
    }

    return 0;
}

void copy(const char *src_path, const char *dst_path)
{
    FILE *src, *dst;
    int buffer_size = 8 * 1024;
    char buffer[buffer_size];
    size_t length;

    src = fopen(src_path, "rb");
    dst = fopen(dst_path, "wb");

    while (!feof(src))
    {
        length = fread(buffer, 1, buffer_size, src);
        fwrite(buffer, 1, length, dst);
    }

    fclose(src);
    fclose(dst);
}
George
  • 35
  • 4
  • The recommended number of threads is the number of cores multiplied by two, plus one. 50 threads might as well be *slower* than a single-threaded application, not only because CPU core starvation but also because of I/O starvation. – Some programmer dude May 19 '20 at 05:58
  • 3
    This code is not C:( – Martin James May 19 '20 at 05:59
  • But you have a much worse problem, that leads your code to have *undefined behavior*: You pass a pointer to the local array `src` to all threads. This array will go out of bounds and cease to exist each iteration of the loop! The pointer might become invalid at any moment in your thread. – Some programmer dude May 19 '20 at 05:59
  • 1
    Please adjust your tags, your code looks like `c++`, not `c`. – rtx13 May 19 '20 at 06:01
  • 1
    Furthermore, please read [Why is “while ( !feof (file) )” always wrong?](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong) And if you decide to use C++ streams instead (which you really should) [Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-i-e-while-stream-eof-cons) – Some programmer dude May 19 '20 at 06:01
  • Finally, [here's a list of good C++ books](https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list/388282#388282). Please invest in a couple and learn C++ properly. Or take a couple of classes. – Some programmer dude May 19 '20 at 06:02
  • @Someprogrammerdude, The number of CPU cores is relevant when CPU cores are the bottleneck. In this program (assuming the mistake called out by Jeremy Friesner is fixed), I/O almost certainly will be the bottleneck. – Solomon Slow May 19 '20 at 13:03

1 Answers1

2

I believe your problem is that you are passing src (which is a pointer to a local variable on your main thread's stack) to your thread's entry function, but since your copy() function runs asynchronously in a separate thread, the char src[40] array that you are passing a pointer to has already been been popped off of the main thread's stack (and likely overwritten by other data) before the copy() function gets a chance to read its contents.

The easy fix would be to make a copy of the string on the heap, so that you can guarantee the string will remain valid until the the copy() function executes and reads it:

    vth.emplace_back(std::bind(copy, "Original.deb", strdup(src)));

... and be sure to have your copy() function free the heap-allocation when it's done using it:

void copy(const char *src_path, const char *dst_path)
{
  FILE *src, *dst;
  int buffer_size = 8 * 1024;
  char buffer[buffer_size];
  size_t length;

  src = fopen(src_path, "rb");
  dst = fopen(dst_path, "wb");

  free(dst_path);   // free the string previously allocated by strdup()

  [...]

Note that you don't currently have the same problem with the "Original.deb" argument since "Original.deb" is a string-literal and therefore stored statically in the executable, which means it remains valid for as long as the program is running -- but if/when you change your code to not use a string-literal for that argument, you'd likely need to do something similar for it as well.

Jeremy Friesner
  • 70,199
  • 15
  • 131
  • 234