Suppose you have a base directory containing an unknown number of files, plus subfolders with more files in them, and you need to rename every file to append the date it was created,

e.g. filename.ext -> filename_09_30_2021.ext

Assume the renaming function already exists and returns 1 on success, 0 on failure and -1 on error:

int rename_file(char * filename)
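
The question takes rename_file as given. Purely for illustration, the name-building step it implies might look like the sketch below. The helper name build_dated_name and its parameters are assumptions, not part of the original post, and the caller is expected to supply the creation date as a struct tm, since the way to obtain a creation time differs between platforms.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not from the original post).
 * Writes "<name>_MM_DD_YYYY<.ext>" for `path` into `out`.
 * `created` is supplied by the caller; only its year, month and
 * day-of-month fields are used.
 * Returns 0 on success, -1 if formatting fails or `out` is too small. */
int build_dated_name(const char *path, const struct tm *created,
                     char *out, size_t out_size)
{
    char stamp[16];
    if (strftime(stamp, sizeof stamp, "%m_%d_%Y", created) == 0)
        return -1;

    /* Look for the extension only in the last path component, so a dot
     * in a directory name is not mistaken for one. */
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    const char *dot = strrchr(base, '.');
    if (dot == NULL || dot == base)
        dot = path + strlen(path);  /* no extension, or a dotfile */

    int n = snprintf(out, out_size, "%.*s_%s%s",
                     (int)(dot - path), path, stamp, dot);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With tm_year = 121, tm_mon = 8 and tm_mday = 30, the path "filename.ext" becomes "filename_09_30_2021.ext".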

I'm having trouble understanding how you would write the multi-threaded file parsing section to increase the speed.

Would it first have to split the entire file tree into, say, 4 arrays of filenames and then create 4 threads, one per array?

Wouldn't that be counterproductive and slower than a single thread going down the file tree and renaming files as it finds them instead of listing them for any multi-threading?

mov eax
  • Multi-threading may improve performance in case you encounter a subdirectory and you let another thread do the job inside that subdirectory, so the main thread continues with the current directory. – Luca Polito Sep 30 '21 at 09:36
  • Multithreading can enable concurrency for looping over a directory and renaming. The latter might be slow compared to scanning the entries. – the busybee Sep 30 '21 at 09:48
  • Splitting a problem and running it in multiple threads may reduce performance as well. Search for "pipeline pingpong" and "locality of reference". That said, instead of splitting by directory, you could also split by task, i.e. one thread to locate files and one to rename them. Why would you want to multithread this in the first place? What are your actual observations that lead you to the conclusion that some kind of multithreading would help? Also, this may well be an IO-bound task, which you can't speed up by assigning it to multiple CPUs at all. – Ulrich Eckhardt Sep 30 '21 at 10:24
  • I think you should replace all occurrences of multi-threading with recursion, here. – Wör Du Schnaffzig Sep 30 '21 at 11:16

1 Answer

Wouldn't that be counterproductive and slower than a single thread going down the file tree and renaming files as it finds them instead of listing them for any multi-threading?

In general, multithreading improves performance for CPU-intensive operations. In this case, you'll probably see little to no improvement. It's even quite possible that it gets slower.

The bottleneck here is not the CPU. It's reading from the disk.
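
Before adding threads, it may help to measure a plain single-threaded walk as a baseline. The following is a minimal sketch assuming a POSIX system; walk_and_rename and dry_run_rename are hypothetical names, and dry_run_rename only stands in for the rename_file function from the question so the sketch can be run without touching any files.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Same shape as rename_file() in the question:
 * 1 on success, 0 on fail, -1 on error. */
typedef int (*rename_fn)(char *filename);

/* Stand-in for rename_file(): prints the path and reports success. */
int dry_run_rename(char *filename)
{
    printf("would rename: %s\n", filename);
    return 1;
}

/* Hypothetical helper: walk `dir` depth-first on a single thread and call
 * `fn` on every regular file. Symbolic links are not followed.
 * Returns the number of files for which `fn` returned 1,
 * or -1 if `dir` itself cannot be opened. */
long walk_and_rename(const char *dir, rename_fn fn)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    long renamed = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        char path[4096];  /* assumed upper bound on path length */
        int n = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;     /* path does not fit: skip this entry */

        struct stat st;
        if (lstat(path, &st) != 0)
            continue;     /* entry vanished or is unreadable: skip */

        if (S_ISDIR(st.st_mode)) {
            long sub = walk_and_rename(path, fn);
            if (sub > 0)
                renamed += sub;
        } else if (S_ISREG(st.st_mode)) {
            if (fn(path) == 1)
                renamed++;
        }
    }
    closedir(d);
    return renamed;
}
```

Calling walk_and_rename(".", dry_run_rename) lists what would be renamed. With a real rename callback, note that POSIX leaves it unspecified whether readdir returns entries added to a directory during the scan, so a renamed file could be visited twice; collecting the paths first and renaming afterwards avoids that.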

Related: an answer I wrote about access times in general: https://stackoverflow.com/a/45819202/6699433

klutt