Is there a thread-safe implementation of nftw() in C/C++? The documentation says:

"The nftw() function need not be thread-safe."

I'm going to use nftw() in a recursive delete function that walks a directory structure in a multi-threaded application.

Brian Tompsett - 汤莱恩
multiholle
  • You're probably not going to get any speedup anyway because walking the directory tree is I/O bound. – Mysticial Jul 13 '12 at 01:10
  • OP did not say he/she wants a multi-threaded `nftw` or to perform multiple directory tree walks in different threads for performance purposes. The problem is that, per the specification, `nftw` is not safe for multi-threaded use, so if the application is multi-threaded, one must make special (often prohibitively costly) efforts to ensure that it's not possible to invoke it in more than one thread at once. – R.. GitHub STOP HELPING ICE Jul 13 '12 at 01:25
  • Does anyone know why nftw is not thread safe? – Kjell Hedström Apr 21 '14 at 15:54

1 Answer

One trivial way to make a non-thread-safe function thread-safe is to wrap it in a function that obtains a lock before calling it, and always call it through this wrapper. In general you would need to copy out the results before unlocking, but nftw does not yield any results that would need to be copied after it returns. A few caveats though:

  1. This will of course prevent all parallelism when multiple threads want to use the interface.

  2. The FTW_CHDIR flag makes nftw chdir into each directory it walks. This is a very bad thing for a multi-threaded app (since the current directory is shared by all threads), so you should avoid using this flag.

On POSIX 2008 systems with the openat and related interfaces, it's pretty trivial to implement your own equivalent of nftw without any chdir usage or pathname length limitations, so you might be better off just writing your own.
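
As a rough illustration of that approach, here is a sketch of a recursive delete built on openat, fdopendir, fstatat and unlinkat. The names `rmtree_at` and `empty_dir` are hypothetical. It never calls chdir and never builds a full pathname, so it has no pathname length limit. It assumes that readdir is safe when each thread uses its own directory stream, which common implementations such as glibc document but POSIX 2008 does not strictly require:

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int rmtree_at(int parent_fd, const char *name);

/* Remove every entry inside the directory open on dfd.
 * Takes ownership of dfd: closedir() closes it. */
static int empty_dir(int dfd)
{
    DIR *dir = fdopendir(dfd);
    struct dirent *ent;
    int ret = 0;

    if (dir == NULL) {
        close(dfd);
        return -1;
    }
    while (ret == 0 && (ent = readdir(dir)) != NULL) {
        struct stat st;

        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        /* Names are resolved relative to dfd, not the working directory. */
        if (fstatat(dfd, ent->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            ret = -1;
        else if (S_ISDIR(st.st_mode))
            ret = rmtree_at(dfd, ent->d_name);
        else if (unlinkat(dfd, ent->d_name, 0) != 0)
            ret = -1;
    }
    closedir(dir);
    return ret;
}

/* Delete the directory `name` (relative to parent_fd) and everything in it.
 * Pass AT_FDCWD as parent_fd to resolve `name` like an ordinary path. */
int rmtree_at(int parent_fd, const char *name)
{
    int dfd = openat(parent_fd, name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);

    if (dfd < 0)
        return -1;
    if (empty_dir(dfd) != 0)
        return -1;
    return unlinkat(parent_fd, name, AT_REMOVEDIR);
}
```

A call such as `rmtree_at(AT_FDCWD, path)` removes the tree rooted at `path`. The sketch stops at the first error and keeps one descriptor open per directory level, so a very deep tree could run into the process's descriptor limit.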

R.. GitHub STOP HELPING ICE
  • +1 ha, I didn't consider the possibility that the OP might just be after correctness rather than performance. – Mysticial Jul 13 '12 at 01:37
  • ..and of course, even more trivial is to ensure that your code only calls `nftw()` from a single thread. You can also use `chdir()` if you treat the current working directory as a resource in its own right, and protect it by a mutex. – caf Jul 13 '12 at 01:57
  • @caf: If you `chdir`, then any use of relative pathnames in your program must be done under lock. I would rather treat the current working directory as a constant, the directory in which the program was initially invoked, or do away with it entirely by `chdir("/")` in long-lived programs where you don't want to possibly prevent unmounting a filesystem. – R.. GitHub STOP HELPING ICE Jul 13 '12 at 02:37
  • @caf: And I think you overestimate the "triviality" of ensuring `nftw` is only called from a single thread. If you have a GUI program where each window runs in a thread, or a server where each connection runs in a thread, it's natural for them to perform their own operations, and unnatural for them to have to worry about what other threads are doing when they're not actually working with shared data. – R.. GitHub STOP HELPING ICE Jul 13 '12 at 02:39