6

I am creating tree of files and folder. I am rewriting to multithreading. The only weak point I see is when creating folders. Right now it goes one by one (in depth). Before I write down the file I check if the path exists. If not, I use mkdirs to create all that is missing.

public void checkDir(String relativePath) {
        File file = new File(homePath + relativePath);
        if (!file.exists()) {
            if (file.mkdirs()) {
                log.info("Directory: " + homePath + relativePath + " is created!");
            } else {
                log.error("Failed to create directory: " + homePath + relativePath + " !");
            }
        }
    }

I have a question what happens when I use two threads. One has path A/B/C and the other one A/B/D. Let's say I have only A folder existing but not B. So both of them will check that the path doesn't exist and want to create it. So one of them will probably fail because the other one will be faster. So how can I manage this?

  1. I was thinking about removing the exists condition and leave it fail but there is no AlreadyExists exception I could catch..
  2. Create directory tree first (but I think there will be better way?)
  3. Put directory creation as critical section and make it sequential - not sure how to do this in spring, but anyway hot sure it is neccessary and won't slow down the process too much.

Maybe I am overthinking too much, but theoretically such situation might happen. Currently I use regular Thread, but I want to use spring TaskExecutor for this. It handles critical sections on its own, but this is not shared variable or anything and the path is different so I think it won't recognize it.

Thanks for suggestions.

Elis.jane
  • 259
  • 2
  • 4
  • 10
  • I wont fail if one is faster java will realize that the directory exists and wont make it. –  Apr 07 '15 at 09:48
  • Since this does not seem performance critical, why not just synchronise the method? – Zarathustra Apr 07 '15 at 11:01
  • You should take into consideration Kári answer. Atomic does not automatically implies the operation is synchronized. – Antonio Dec 20 '16 at 08:34
  • Note that concatenation of strings to create file paths is error prone and fragile. As if representing files are strings in the first place. I would suggest you rewrite your code to use `Path` from the outset and use methods like `relativize` and `resolve` on `Path` to process these paths. Finally, use `Files` to carry out operations as it’s error handling is a lot better. TL;DR don’t use `String` to represent typed data and don’t use the `File` API. – Boris the Spider Oct 13 '19 at 09:03

3 Answers3

7

The File.mkdirs() method is specified to create the directory, and all its parents, if they don't exist. Ergo there's no point in calling exists(). Existence will get checked anyway. Calling exists() is just wasting time. mkdirs() is essentially an atomic operation: there is really no point in trying to out-think it.

Note that a return value of false isn't necessarily a failure. It might just indicate that all the directories in the path already existed.

Basically the premiss of your question is false.

user207421
  • 305,947
  • 44
  • 307
  • 483
5

None of the answers seem to address the issue of whether mkdirs() is thread safe, one answer states that mkdirs() is atomic but there might exist cases where this fails. This function essentially deals with the file system so it probably involves a system call to the operating system on the respective host, and determining whether those system calls are actually thread safe could be impossible if you don't already know the target system your application will be used on.

For example, even though mkdirs() checks for existence before creating the folder structure what would happen in the following case,

Thread 1 calls mkdirs(), it inherently checks for the existence of the folder structure and determines it doesn't exist. At that time, Thread 1 gets pre-empted.

Thread 2 calls mkdirs(), it inherently checks for the existence of the folder structure and determines that it doesn't exist and subsequently moves on to create the folder structure.

Thread 1 starts up again and contines to try creating the folder structure with the previous determination that it didn't exist before.

What happens there? I don't know, this sequence of events would be hard to test and especially with the knowledge that the create folder system call varies between operating systems. Your best bet for thread safety and to avoid introducing errors that would be potentially difficult to track and debug would be to implement a degree of mutual exclusion at this critical section in the code.

I guess it would be easy to take a naive approach and declare a single 'global' variable that both threads would have access to, for example a Boolean b and then add the following code around your critical section,

synchronized(b) {
     // Your critical section here
}

This would guarantee that if one thread has locked b it alone will only get to access the critical section while the other one waits, thus making sure that mkdir() won't get called by both threads.

However, if you want to learn more about multi-threading and how to implement mutual exclusion on a lower level, in this case I'd suggest you look at semaphores and how they could be implemented to solve this.

Kári
  • 115
  • 3
  • I agree with this explanation. Atomic does not automatically implies the operation is synchronized. – Antonio Dec 20 '16 at 08:33
  • You don't know whether 'it inherently checks for the existence'. If I was implementing it, I wouldn't check anything. I would just attempt to create all the directories concerned, from parent to child, each one of which is atomic inside the kernel, and then return true if and only if all the creations succeeded. No multithreading issue could therefore possibly arise. And I would assume that the Sun guys saw this as well. – user207421 Oct 13 '19 at 08:58
1

As EJP points out a return of false can mean many things, some errors, some not. If you want to log the fact it was actually unable to create the directory, you should check the existence afterwards:

public final class DirectoryHelper {
   private DirectoryHelper(){}

   public static boolean createDirectories(File path) {
      if (path.mkdirs()) return true; //definitely has new dir
      // if false, just look afterwards for the existence of the directory
      // also opportunity to throw own exceptions if you prefer
      return path.exists() && path.isDirectory();
   }
}

I've written here a new method that returns false only if the directory is not there afterwards. I don't care if it's just been made or existed already. Due to the new order I don't need a synchronized block either.

Your code then looks like:

public void checkDir(String relativePath) {
    File file = new File(new File(homePath), relativePath);
    if (!file.exists()) { // just to prevent logging of existing dirs
        if (DirectoryHelper.createDirectories(file)) {
            log.info("Directory: " + file + " is created!");
        } else {
            log.error("Failed to create directory: " + file + " !");
        }
    }
};
weston
  • 54,145
  • 21
  • 145
  • 203