
I'm trying to find an easy way to map a URI to a Path without writing code specific to any particular file system. The following seems to work but requires a questionable technique:

public void process(URI uri) throws IOException {
    try {
        // First try getting a path via existing file systems. (default fs)
        Path path = Paths.get(uri);
        doSomething(uri, path);
    }
    catch (FileSystemNotFoundException e) {
        // No existing file system, so try creating one. (jars, zips, etc.)
        Map<String, ?> env = Collections.emptyMap();
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            Path path = fs.provider().getPath(uri);  // yuck :(
            // assert path.getFileSystem() == fs;
            doSomething(uri, path);
        }
    }
}

private void doSomething(URI uri, Path path) {
    FileSystem fs = path.getFileSystem();
    System.out.println(uri);
    System.out.println("[" + fs.getClass().getSimpleName() + "] " + path);
}

Running this code on a couple of examples produces the following:

file:/C:/Users/cambecc/target/classes/org/foo
[WindowsFileSystem] C:\Users\cambecc\target\classes\org\foo

jar:file:/C:/Users/cambecc/bin/utils-1.0.jar!/org/foo
[ZipFileSystem] /org/foo

Notice how the URIs have been mapped to Path objects that have been "rooted" into the right kind of FileSystem, like the Path referring to the directory "/org/foo" inside a jar.

What bothers me about this code is that although NIO2 makes it easy to:

  • map a URI to a Path in existing file systems: Paths.get(URI)
  • map a URI to a new FileSystem instance: FileSystems.newFileSystem(uri, env)

... there is no nice way to map a URI to a Path in a new FileSystem instance.

The best I could find was that, after creating a FileSystem, I can ask its FileSystemProvider to give me a Path:

Path path = fs.provider().getPath(uri);

But this seems wrong as there is no guarantee it will return a Path that is bound to the FileSystem that I just instantiated (i.e., path.getFileSystem() == fs). It's pretty much relying on the internal state of FileSystemProvider to know what FileSystem instance I'm referring to. Is there no better way?

cambecc

2 Answers


You found a bug in the implementation/documentation of zipfs. The documentation of the Paths.get method states:

* @throws  FileSystemNotFoundException
*          The file system, identified by the URI, does not exist and
*          cannot be created automatically

Edit: For FileSystems that need closing, it might be better to require the programmer to call newFileSystem, so that they can also close it. In that case the documentation should rather read "should not be created automatically".

ZipFs never tries to create a new file system. A failed get() is not caught and retried with a newFileSystem call; the exception is passed straight to the caller. See the source:

public Path getPath(URI uri) {

    String spec = uri.getSchemeSpecificPart();
    int sep = spec.indexOf("!/");
    if (sep == -1)
        throw new IllegalArgumentException("URI: "
            + uri
            + " does not contain path info ex. jar:file:/c:/foo.zip!/BAR");
    return getFileSystem(uri).getPath(spec.substring(sep + 1));
}

In other words:

Paths.get()

should be enough for all FileSystems based on NIO2. With the zipfs design, however, you have to write:

Path path;
try {
   path = Paths.get( uri );
} catch ( FileSystemNotFoundException exp ) {
   // No open file system for this URI yet, so open one explicitly and close it when done.
   try ( FileSystem fs = FileSystems.newFileSystem( uri, Collections.<String, Object>emptyMap() ) ) {
       path = Paths.get( uri );
       ... use path ...
   }
}

This is the short form of your workaround.

Note: The NIO documentation states that getFileSystem must return the FileSystem created by the matching newFileSystem call.
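
To make the workaround easier to try out, here is a minimal, self-contained sketch. It assumes the JDK's bundled zip file system provider is available, and the class name UriToPathDemo, the helper names, and the entry org/foo/hello.txt are all made up for illustration. It builds a throwaway zip, opens a file system for a jar: URI, and checks whether Paths.get(uri) hands back a Path bound to the FileSystem that was just opened. It only demonstrates the behaviour for zipfs; it is not a guarantee for other providers.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class UriToPathDemo {

    /** Creates a temporary zip containing one entry (hypothetical fixture for this demo). */
    static Path createSampleZip() throws IOException {
        Path zip = Files.createTempFile("demo", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("org/foo/hello.txt"));
            zos.write("hi".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return zip;
    }

    /**
     * Returns true if, after explicitly opening a file system for the URI,
     * Paths.get(uri) yields an existing Path bound to that same instance.
     * Returns false if a file system for the URI was already open.
     */
    static boolean resolvesToOpenedFileSystem(URI uri) throws IOException {
        try {
            Paths.get(uri);
            return false; // some file system for this URI already existed
        } catch (FileSystemNotFoundException e) {
            try (FileSystem fs = FileSystems.newFileSystem(
                    uri, Collections.<String, Object>emptyMap())) {
                Path path = Paths.get(uri);
                return path.getFileSystem() == fs && Files.exists(path);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = createSampleZip();
        try {
            URI uri = URI.create("jar:" + zip.toUri() + "!/org/foo/hello.txt");
            System.out.println(resolvesToOpenedFileSystem(uri));
        } finally {
            Files.delete(zip);
        }
    }
}
```

The identity check inside the try-with-resources block is the same assertion that is commented out in the question's code.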

openCage
  • I too was wondering if this were a bug, but I think explicitly creating the file system using `FileSystems.newFileSystem` makes sense. Creating a new zipfs means _opening_ the underlying zip file, a file that must eventually be closed. This is why `FileSystem` implements `Closeable`. So the reason `Paths.get(uri)` does not automatically open a zipfs is because the designers wanted the opening and closing of the zipfs `FileSystem` object to be explicitly done by the programmer. At least, that's my conjecture. :) That's why I used try-with-resources in my sample code above. – cambecc Apr 26 '13 at 06:33
  • @cambecc I changed my answer. The difference is between 'should be opened automatically' and 'can be opened automatically'. A – openCage Apr 29 '13 at 11:23
  • @cambecc If you rely on the Oracle-documented behaviour of getFileSystem to return only and exactly the FileSystems previously opened by newFileSystem, you can write the code as in the changed answer. This only leaves a strange feeling about closing a FileSystem that is in use at a different place / thread. – openCage Apr 29 '13 at 11:29
  • Yes, it's exactly that "strange feeling" I wish could be avoided. Just like in my original example, there is no guarantee `Paths.get(uri)` will return a `Path` rooted in the `FileSystem` that was just created on the previous line. – cambecc Apr 30 '13 at 07:02
  • The crux of the problem is highlighted by the javadoc for `FileSystemProvider#getPath(URI)` (which btw is what `Paths.get(URI)` eventually invokes). Here is what the javadoc says: _Return a Path object by converting the given URI. The resulting Path is associated with a FileSystem that already exists or is constructed automatically._ So... if 19 separate instances of `FileSystem` exist for that provider, which of them gets associated with the resulting `Path` object? I've not found a way to explicitly choose the `FileSystem` instance to associate, so I believe this is a flaw in the API. – cambecc Apr 30 '13 at 07:11
  • @cambecc The 19 different FileSystems must result in different URIs, e.g. memoryfs:foo1 ... memoryfs:foo19/ are URIs for roots in their filesystem. The default filesystem exists only once and needs no distinguishing information. All filesystem providers allowing multiple filesystems need URIs of the form :. The form of the filesystemid is implementation dependent. – openCage May 28 '14 at 21:22

Q: "I'm trying to find an easy way to map a URI to a Path without writing code specific to any particular file system"

A: There is no such way

The whole question is only interesting if the filesystem associated with the URI is not open yet, i.e. when getFileSystem (in Paths.get) throws FileSystemNotFoundException. But to call newFileSystem you need to know 2 things:

  • what (part of the) URI to use to create the new filesystem. The documentation says that, for example, in the case of the default filesystem the path component of the URI must be a root. E.g. newFileSystem( URI.create( "file:///duda" ), Collections.EMPTY_MAP ) fails.
  • what to set in the environment map, e.g. might be a password.

So to create a new filesystem from a URI you must have knowledge about the filesystem to create.
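
A small sketch of both points, under stated assumptions: the "create" environment property is specific to the JDK's zip file system provider, and the class and method names (NewFileSystemDemo, createZipViaEnv, defaultProviderRejects) are invented for this example. The first method only succeeds in creating a missing archive because the caller knows which env entry zipfs understands; the second shows the default provider refusing a file: URI whose path is not the root.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystemAlreadyExistsException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class NewFileSystemDemo {

    /**
     * zipfs-specific knowledge: the env entry "create"="true" asks the zip
     * provider to create the archive if it does not exist yet.
     * Returns true if the archive file exists afterwards.
     */
    static boolean createZipViaEnv(Path zip) throws IOException {
        URI uri = URI.create("jar:" + zip.toUri());
        Map<String, String> env = new HashMap<>();
        env.put("create", "true");
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            Files.write(fs.getPath("/hello.txt"), "hi".getBytes(StandardCharsets.UTF_8));
        }
        return Files.exists(zip);
    }

    /**
     * Default-provider knowledge: newFileSystem does not accept an arbitrary
     * file: URI. Returns true if the call is rejected.
     */
    static boolean defaultProviderRejects(URI uri) throws IOException {
        try {
            FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap()).close();
            return false;
        } catch (IllegalArgumentException | FileSystemAlreadyExistsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("zipenv");
        Path zip = dir.resolve("new.zip"); // does not exist yet
        try {
            System.out.println(createZipViaEnv(zip));
            System.out.println(defaultProviderRejects(URI.create("file:///duda")));
        } finally {
            Files.deleteIfExists(zip);
            Files.delete(dir);
        }
    }
}
```

Neither the env key nor the URI shape can be derived from the URI alone, which is the point being made above.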

openCage