
I'm trying to find out whether a folder is actually a hard link to another folder and, if so, to find its real path.

I wrote a simple example in Python (symLinks.py):

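# Python 3.4
import os

# Collect every directory path under the current directory
dirList = [x[0] for x in os.walk('.')]

print(dirList)

# Show the resolved path and whether Python considers each one a symlink
for d in dirList:
    print(os.path.realpath(d), os.path.islink(d))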

"""
Given this directory structure:
<dir_path>\Example\
    <dir_path>\Example\symLinks.py
    <dir_path>\Example\hardLinkToF2 #hard link pointing to <dir_path>\Example\FOLDER1\FOLDER2
    <dir_path>\Example\softLinkToF2 #soft link pointing to <dir_path>\Example\FOLDER1\FOLDER2
    <dir_path>\Example\FOLDER1
        <dir_path>\Example\FOLDER1\FOLDER2

The output from executing: C:\Python34\python <dir_path>\Example\symLinks.py is:
['.', '.\\FOLDER1', '.\\FOLDER1\\FOLDER2', '.\\hardLinkToF2']
<dir_path>\Example False
<dir_path>\Example\FOLDER1 False
<dir_path>\Example\FOLDER1\FOLDER2 False
<dir_path>\Example\hardLinkToF2 False
"""

In this example os.path.islink always returns False, for both hard and soft links. On the other hand, os.path.realpath returns the actual path for soft links, but not for hard links.

I made this example using Python 3.4 on Windows 8. I have no clue whether I am doing something wrong or whether there is another way to achieve this.

Andres Tiraboschi
  • A hard link points to the same inode as the original file, but it doesn't refer to the original file. Therefore, I'm not sure that, given a hard link, you can determine the original file. A symbolic link refers to the original file by name. Therefore, it **is** possible to get to the original file when given a symbolic link. So I think the behavior you are describing is just how it works. – RobertB Jan 30 '17 at 17:16
  • Secondly, the docs say that `islink` returns "Always False if symbolic links are not supported by the Python runtime". Perhaps that is relevant to the symbolic link behavior you are seeing. – RobertB Jan 30 '17 at 17:22
  • Agree with @RobertB: Two hard links to the same file are supposed to be indistinguishable, they're not actually "links" from the perspective of anyone using them, all you can tell is that the underlying file is referenced in N different places. The best you could do is scan the whole file system until you found all the entries with the same inode number. Just to be clear, are you using actual symbolic links, or NTFS junctions? It looks like [junctions aren't properly detected as links](https://bugs.python.org/issue29250). – ShadowRanger Jan 30 '17 at 17:33
  • On Windows Vista and later this is easy for hard links. Use [`FindFirstFileNameW`](https://msdn.microsoft.com/en-us/library/aa364421), `FindNextFileNameW`, and `FindClose`. You can use ctypes for this, or if you have PyWin32 installed use [`win32file.FindFileNames`](http://docs.activestate.com/activepython/3.4/pywin32/win32file__FindFileNames_meth.html) (seems to have a bug leaving a trailing NUL on the filename). – Eryk Sun Jan 30 '17 at 20:41
  • Note that NTFS does not allow hard linking directories, in which case you could instead have a reparse point (junction, symlink, etc). – Eryk Sun Jan 30 '17 at 20:46

2 Answers


Not to be too harsh, but I spent 1 minute googling and got all the answers. Hint hint.

To tell if they are hardlinks, you have to scan all the files then compare their os.stat results to see if they point to the same inode. Example:

https://gist.github.com/simonw/229186
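
For readers who prefer not to follow the link, here is a minimal sketch of the scan described above. It is not the code from the gist; the function name `find_hardlink_groups` is hypothetical, and it assumes a Python version and file system where `os.lstat` reports meaningful `st_ino`, `st_dev`, and `st_nlink` values.

```python
import os
import stat
from collections import defaultdict


def find_hardlink_groups(root):
    """Group regular files under `root` that share one (device, inode) pair.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symbolic links
            # Only regular files with more than one directory entry qualify.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only files for which at least two names were found under root.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Each returned list holds the names found under `root` for one underlying file. None of them is "the original": hard links are equal peers, so the scan can only report that the names refer to the same file.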

For symbolic links in python on Windows, it can be trickier... but luckily this has already been answered:

Having trouble implementing a readlink() function

Also, per @ShadowRanger in the comments, make sure you are not using junctions instead of symbolic links, since they may not be reported correctly:

https://bugs.python.org/issue29250

RobertB
  • That second link shouldn't be needed. It's specific to pre-3.2 versions of Python, where `realpath` (on Windows) was an alias for `abspath`. On 3.4, as the OP's question shows, it looks like the paths are being resolved (although it's possible they're being resolved by `os.walk` before `realpath` gets to them, since `os.walk` doesn't understand junctions, and [follows them even with `follow_symlinks=False`](https://bugs.python.org/issue23407), the default). `realpath` should work on 3.4, it's just that [`islink` might not report junctions correctly](https://bugs.python.org/issue29250). – ShadowRanger Jan 30 '17 at 18:03
  • Hmm... Never mind. Looks like, at least on my Python 3.4, `realpath` remains an alias for `abspath`. Blech. – ShadowRanger Jan 30 '17 at 18:18
  • You can just test `os.path.realpath is os.path.abspath` to check. `islink` reports `False` for the junctions I can test right now, and `realpath` doesn't follow the `symlink`. `os.walk` seems to exclude the junction from traversal only due to a permissions issue that's being silently ignored; the only junctions I can test are the weirdly permissioned ones the OS ships with. The only way I can figure out to test whether it's a junction is to use `lstat` and `stat` with `os.path.samestat` (only on Windows since 3.4) to determine whether they refer to the same thing. – ShadowRanger Jan 30 '17 at 18:31
  • This answer is accurate but the code example shouldn't be on an external site and the social approach is a bit rude tbh. – monokrome Aug 25 '20 at 06:09
  • You can check if two files are hardlinks to the same file with `os.path.samefile(file1,file2)`. – mmj Mar 11 '21 at 09:21

Links to directories on Windows are implemented using reparse points. They can take the form of either "directory junctions" or "symbolic links". Hard links to directories are not possible on Windows NTFS.

At least as of Python 3.8, os.path.samefile(dir1, dir2) supports both symbolic links and directory junctions and will return True if both resolve to the same destination.

os.path.realpath(dirpath) will also work to give you the real (completely resolved) path for both symbolic links and directory junctions.
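
As a rough illustration of the two calls above, the sketch below combines them. The helper name `describe_link` and its parameter names are hypothetical, and the behavior for junctions assumes Python 3.8 or later on Windows, as stated above.

```python
import os


def describe_link(path, candidate_target):
    """Return (resolved_path, same_target) for a possibly linked directory.

    Hypothetical helper for illustration only.
    """
    # Fully resolve any links in `path`.
    resolved = os.path.realpath(path)
    # True when both paths end up at the same file system object.
    same = os.path.samefile(path, candidate_target)
    return resolved, same
```

For the layout in the question, `describe_link("softLinkToF2", os.path.join("FOLDER1", "FOLDER2"))` would be the kind of call to try.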

If you need to determine which of the two directories is a reparse point, you can use os.lstat(), because os.path.islink() only detects symbolic links.

import os
import stat

def is_reparse_point(dirpath):
    # st_file_attributes is only available on Windows (Python 3.5+)
    attrs = os.lstat(dirpath).st_file_attributes
    return bool(attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT)
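
Building on that check, a small classifier might look like the sketch below. The function name `classify_path` and its return labels are hypothetical. It assumes that a reparse point which `os.path.islink` does not recognize is something like a junction, and it falls back to "regular" on platforms where `st_file_attributes` does not exist.

```python
import os
import stat

# 0x400 is the documented value of FILE_ATTRIBUTE_REPARSE_POINT; the getattr
# fallback only matters on Python versions where the stat module lacks it.
REPARSE_POINT = getattr(stat, "FILE_ATTRIBUTE_REPARSE_POINT", 0x400)


def classify_path(path):
    """Return 'symlink', 'reparse point', or 'regular' for `path`.

    Hypothetical helper for illustration only.
    """
    if os.path.islink(path):
        return "symlink"
    # st_file_attributes only exists on Windows; treat it as 0 elsewhere.
    attrs = getattr(os.lstat(path), "st_file_attributes", 0)
    if attrs & REPARSE_POINT:
        return "reparse point"  # for example, a directory junction
    return "regular"
```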

Insofar as it may be valuable for testing, here are some useful utilities available in the Windows CMD shell.

Interrogate reparse point data:

>fsutil reparsepoint query <path to directory>

Create reparse points of both the "symbolic link" and "directory junction" variety *:

>mklink /d <symbolic link name> <path to target directory>
>mklink /j <junction name>      <path to target directory>

You can read more about the difference between hard links and junctions, symbolic links, and reparse points in Microsoft's docs.

*Note that creating symbolic links typically requires Administrator privileges.