7

I'm on Linux and my Java application is not intended to be portable.

I'm looking for a way to identify a file uniquely in Java. I can make use of the statfs syscall, since the pair (f_fsid, ino) uniquely identifies a file (not only within a single file system), as specified here: http://man7.org/linux/man-pages/man2/statfs.2.html

The question is whether it is possible to extract fsid from Java directly, so that I can avoid writing a JNI function.

The inode can be extracted with NIO, but how about fsid? The inode and fsid come from different structures and are retrieved by different syscalls...

Some Name
  • Pardon my ignorance, but wouldn't a file system path (e.g. `/home/user/file.txt`) identify the file? – Karol Dowbecki Dec 26 '18 at 18:23
  • 1
    @KarolDowbecki Unfortunately, identifying a file by its path alone does not survive renaming and is prone to race conditions when files are operated on concurrently. – Some Name Dec 26 '18 at 18:24
  • 2
    @KarolDowbecki not in the case of a symlink – HairOfTheDog Dec 26 '18 at 18:25
  • 1
    *If* (and this is a big if) you need to go the native route, then depending on your performance requirements I would suggest using JNA over JNI, because JNA doesn't require you to write a native support library. I was surprised by how easy it was to use JNA. https://github.com/java-native-access/jna – HairOfTheDog Dec 26 '18 at 18:40

2 Answers

4

This Java example demonstrates how to get the Unix inode number of a file.

import java.nio.file.*;
import java.nio.file.attribute.*;

public class MyFile {

  public static void main(String[] args) throws Exception  {

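    Path path = Paths.get("MyFile.java");

    // Read the basic attributes; on Linux the file key wraps the device ID and the inode
    BasicFileAttributes attr = Files.readAttributes(path, BasicFileAttributes.class);

    // The key prints as "(dev=...,ino=...)"; parse the inode number out of that string
    Object fileKey = attr.fileKey();
    String s = fileKey.toString();
    String inode = s.substring(s.indexOf("ino=") + 4, s.indexOf(")"));
    System.out.println("Inode: " + inode);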
  }
}

The output

$ java MyFile
Inode: 664938

$ ls -i MyFile.java 
664938 MyFile.java

Credit where credit is due: https://www.javacodex.com/More-Examples/1/8
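As a hedged alternative to parsing the `toString()` output, the sketch below reads the device ID and the inode as numbers through the JDK's `unix` attribute view. This assumes a JDK on Linux (or another Unix) that exposes the implementation-specific `unix:dev` and `unix:ino` attributes; they are not part of the portable NIO API. The class name `DevIno` and the helper `devIno` are made up for illustration. Note that `dev` here is `st_dev` from `stat`, not `f_fsid` from `statfs`, although the (device, inode) pair serves the same purpose of telling files on different file systems apart.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

public class DevIno {

    // Hypothetical helper: returns "dev:ino" for the given path.
    // Assumes the JDK-specific "unix" attribute view is available.
    static String devIno(Path path) throws IOException {
        // Read both attributes in one call so they describe the same file state
        Map<String, Object> attrs = Files.readAttributes(path, "unix:dev,ino");
        long dev = ((Number) attrs.get("dev")).longValue();
        long ino = ((Number) attrs.get("ino")).longValue();
        return dev + ":" + ino;
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("dev:ino = " + devIno(path));
    }
}
```

The inode part should match what `ls -i` prints for the same file.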

HairOfTheDog
  • If we are operating within a single file system, this seems to work fine. But two files can have the same inode (even if that is unlikely) when they are in directories on different file systems... can't they? – Some Name Dec 26 '18 at 18:31
  • Or does this inode include the fsid as well...? – Some Name Dec 26 '18 at 18:33
  • 2
    @SomeName How about the FileStore which you can access via `Files.getFileStore`? https://docs.oracle.com/javase/8/docs/api/java/nio/file/FileStore.html https://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#getFileStore-java.nio.file.Path- – HairOfTheDog Dec 26 '18 at 18:47
  • Looks like this is exactly what I need. Thanks! – Some Name Dec 26 '18 at 18:49
  • @SomeName what exactly do you want to do with the file key? If you just want to find out whether two files are the same, just use the `equals` method of the object returned by `attr.fileKey()`. As long as you’re not trying to interpret the object as an inode, you do not have to deal with the ambiguity of inodes (of different file stores). – Holger Jan 07 '19 at 12:52
  • @Holger My intention is to persist some file processing state every time files are modified, so I uniquely identify every file by the key. Since I'm listening to inotify events on _multiple_ (configurable) directories, they may not share the same file system, so the `inode` alone is not enough. The problem is that another process may rename the file within the directory that contains it, so I decided to open the file first and get its file descriptor. From the file descriptor I can acquire the key and be protected against renaming. Does the scenario look reasonable? – Some Name Jan 08 '19 at 09:41
  • 2
    @SomeName it’s not really clear at which point using Java code or using native code really is required. Java NIO has a way to uniquely identify files (the object returned by `fileKey()` does that) and it also has a watch service abstraction. Note that the code of the answer goes to great lengths to destroy the uniqueness of the file key. It first converts it to a `String`, then extracts the non-unique inode from it. If you look at the original string representation, you’ll see something like `"(dev=xxx,ino=yyy)"`, clearly hinting at the fact that the key object does uniquely identify the file. – Holger Jan 08 '19 at 12:51
  • @Holger Maybe it's overkill, but I really do not know how to solve the problem strictly in Java. Unfortunately `WatchService` does not support `MOVED_FROM`/`MOVED_TO` events and treats them as `DELETE`/`CREATE`, so in my case it's out. Also, I cannot just use the `Path` obtained from `struct inotify_event` to get the attribute, because the file at that `Path` can be moved and a new one created while the event is being processed. So all IO is currently done via wrappers of JNI functions. – Some Name Jan 08 '19 at 13:12
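
The suggestion in the comments above, comparing the objects returned by `fileKey()` with `equals` instead of parsing them, can be sketched as follows. This is a minimal illustration, not code from the thread: the class `FileKeyCompare` and its helpers are hypothetical names, and the fallback to `Files.isSameFile` is an assumption added for platforms where `fileKey()` returns `null`. It does not address the race described in the last comment, since it still resolves each file by path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class FileKeyCompare {

    // Returns the opaque file key, or null if the platform does not provide one
    static Object keyOf(Path path) throws IOException {
        return Files.readAttributes(path, BasicFileAttributes.class).fileKey();
    }

    // Compares two paths by file key, without interpreting the key's contents
    static boolean sameFile(Path a, Path b) throws IOException {
        Object ka = keyOf(a);
        Object kb = keyOf(b);
        if (ka == null || kb == null) {
            return Files.isSameFile(a, b); // assumed fallback when no key exists
        }
        return ka.equals(kb);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("filekey-demo");
        Path before = Files.createFile(dir.resolve("before.txt"));
        Object keyBefore = keyOf(before);

        // Rename within the same directory; the file itself is unchanged
        Path after = Files.move(before, dir.resolve("after.txt"));
        Object keyAfter = keyOf(after);

        System.out.println("key before rename: " + keyBefore);
        System.out.println("key after rename:  " + keyAfter);
        System.out.println("keys equal: " + (keyBefore != null && keyBefore.equals(keyAfter)));

        Files.delete(after);
        Files.delete(dir);
    }
}
```

Because the key is compared as a whole, files on different file stores that happen to share an inode number are still told apart.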
2

I would suggest the Git approach of hashing the file contents. This is robust against copying and renaming.

Java is supposed to be platform independent, so using Unix-specific methods may not be what you want.
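
A minimal sketch of content hashing with the standard library might look like the following. It assumes a plain SHA-256 digest over the raw bytes, which is not Git's exact object format (Git hashes a header plus the content); the class `ContentHash` and the method `sha256Hex` are illustrative names only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ContentHash {

    // Streams the file through SHA-256 and returns the digest as lowercase hex
    static String sha256Hex(Path path) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(path)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ContentHash <file>");
            return;
        }
        System.out.println(sha256Hex(Paths.get(args[0])));
    }
}
```

A content hash identifies the bytes rather than the file: two separate files with identical contents get the same hash, and the hash changes whenever the file is modified.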

Jonathan Rosenne