Under Linux, I have two file paths A and B:

const char* A = ...;
const char* B = ...;

I now want to determine, should I open(2) them both...

int fda = open(A, ...);
int fdb = open(B, ...);

...will I get two filehandles open to the same file in the filesystem?

To determine this I thought of stat(2):

struct stat
{
    dev_t st_dev;
    ino_t st_ino;
    ...
};

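Something like:

#include <stdbool.h>
#include <sys/stat.h>

bool IsSameFile(const char* sA, const char* sB)
{
    struct stat A, B;

    /* If either path cannot be stat'ed, report "not the same". */
    if (stat(sA, &A) != 0 || stat(sB, &B) != 0)
        return false;

    return A.st_dev == B.st_dev && A.st_ino == B.st_ino;
}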

Are there any cases where A and B are the same file but IsSameFile would return false?

Are there any cases where A and B are different files but IsSameFile would return true?

Is there a better way to do what I'm trying to do?

Andrew Tomazos
  • You can have multiple file descriptors that refer to the same file, yes. – teppic Mar 27 '13 at 02:12
  • @teppic: Yes, and you can also have multiple file descriptors that refer to different files. My question is how do I determine which of those two universes I am in (or would be in). – Andrew Tomazos Mar 27 '13 at 02:17
  • If you do have file descriptors open, you can just use `fstat` directly on them - if the inodes and device numbers are equal, it is impossible for the two paths to refer to different files. – teppic Mar 27 '13 at 02:41

2 Answers

Your program will work in the usual cases: st_ino is the file's inode number, which is unique within a file system, and st_dev identifies the file system. Comparing both fields therefore tells you correctly whether the two paths refer to the same file.

You can also check the value of A.st_mode to find out whether the file is a symbolic link.

Deepu
  • You can only find out that a name is a (broken) symlink via `stat()` if it is in fact a broken symlink. If it is non-broken, `stat()` reports on the file or device at the end of the link; `lstat()` reports on the (first) symlink if the name is a symlink. – Jonathan Leffler Mar 27 '13 at 02:11
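
To illustrate the `stat()` versus `lstat()` distinction described in the comment above, here is a minimal sketch that checks whether a name is itself a symlink. The helper name `is_symlink` is hypothetical, and the 1/0/-1 return convention is an assumption made for this example.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>

/* Hypothetical helper (name and return convention are assumptions):
 *   1  -> `path` is itself a symbolic link (broken or not)
 *   0  -> `path` exists and is not a symbolic link
 *  -1  -> lstat() failed, e.g. the name does not exist (errno is set) */
int is_symlink(const char *path)
{
    struct stat st;

    /* lstat() does not follow a final symlink, so st_mode describes
     * the link itself rather than whatever it points to. */
    if (lstat(path, &st) != 0)
        return -1;

    return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

Replacing `lstat` with `stat` here would make `S_ISLNK` false for every name that resolves successfully, since `stat` reports on the target.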

It depends on why exactly you want to avoid opening the same file twice. Your solution is usually the correct one, but there are some situations where files should be considered the same if they have the same absolute path but not if they are links to the same inode. In that case you need to convert the paths to absolute paths and compare them ... see Getting absolute path of a file

You also need to decide whether you consider a symlink to a file equivalent to the file or another symlink to it. For inode equivalence, that determines whether to use stat or lstat. For path equivalence, it determines whether you can use realpath or if you need to get the absolute path without following symlinks.
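
For the path-equivalence case where following symlinks is acceptable, a minimal sketch using `realpath` might look like the following. The helper name `same_resolved_path` and its return convention are assumptions for illustration; passing NULL as the second argument relies on the POSIX.1-2008 behaviour where `realpath` allocates the result buffer.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name and return convention are assumptions):
 *   1  -> both names resolve to the same canonical absolute path
 *   0  -> they resolve to different paths (this includes two hard
 *         links to the same inode, which have different paths)
 *  -1  -> at least one name could not be resolved (errno is set) */
int same_resolved_path(const char *a, const char *b)
{
    char *ra = realpath(a, NULL);  /* caller must free() the result */
    char *rb = realpath(b, NULL);
    int result = -1;

    if (ra != NULL && rb != NULL)
        result = (strcmp(ra, rb) == 0) ? 1 : 0;

    free(ra);  /* free(NULL) is a no-op */
    free(rb);
    return result;
}
```

This only covers the symlink-following variant. Getting an absolute path without following symlinks needs a different approach, which this sketch does not attempt.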

Jim Balter
  • Using `stat()`, the code will be oblivious to symlinks (except perhaps broken ones). Can you elaborate on 'some situations where files should be considered the same if they have the same absolute path but not if they are links to the same inode'? – Jonathan Leffler Mar 27 '13 at 02:12
  • @JonathanLeffler "Using stat(), the code will be oblivious to symlinks" -- but not using *lstat* -- that's exactly the distinction I made. Elaborating: some backup schemes require files to be copied once for each path (esp. if restoration will be made to an fs that doesn't support hard links), while there's no point in saving the same path twice. There may be other use cases. But as I said, inode equivalence is **usually** what is wanted. – Jim Balter Mar 27 '13 at 02:22
  • @JonathanLeffler And actually it is `lstat` that is oblivious to symlinks, whereas `stat` does an effective `readlink` on them and follows them. In fact, the implementation of `lstat` is just what the implementation of `stat` was before there were symlinks (egads, I was writing UNIX kernel code back then). – Jim Balter Mar 27 '13 at 02:27