I often use file paths to provide some sort of unique id in a software system. Is there any way to take a file path and turn it into a unique integer in a relatively quick (computationally) way?

I am ok with larger integers. This would have to be a pretty nifty algorithm as far as I can tell, but would be very useful in some cases.

Anybody know if such a thing exists?

Alexander Mills

3 Answers

You could try the inode number:

const fs = require('fs');
fs.statSync(filename).ino
David Jones

@djones's suggestion of the inode number is good if the program is only running on one machine and you don't care about a new file duplicating the id of an old, deleted one. Inode numbers are re-used.
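To make the inode idea concrete, here is a minimal sketch. It assumes Node.js 10.5 or later (for the `bigint` option of `fs.statSync`), and the helper name `inodeId` is hypothetical. Because inode numbers are only unique within one filesystem, the sketch also folds in the device number; the 64-bit shift is an assumption about how large those values get on your platform.

```javascript
const fs = require('fs');

// Hypothetical helper: combine device and inode numbers into one BigInt.
function inodeId(filePath) {
  // { bigint: true } avoids precision loss for very large inode numbers.
  const st = fs.statSync(filePath, { bigint: true });
  // Assumption: the inode number fits in 64 bits, so shifting the
  // device number left by 64 keeps the two fields from overlapping.
  return (st.dev << 64n) | st.ino;
}
```

Note that `statSync` follows symlinks, and hard links share an inode, so two different paths to the same file get the same id. The caveats above about reuse after deletion still apply.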

Another simple approach is hashing the path into a large integer space. E.g. with a 128-bit murmurhash (in Java I'd use the Guava Hashing class; there are several js ports), the chance of any collision among a billion paths is only about 1/2^69 by the birthday bound. If you're really paranoid, keep a set of the hash values you've already used and rehash on collision.
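A rough sketch of the hashing approach in Node.js follows. To stay dependency-free it swaps in SHA-256 from the built-in `crypto` module, truncated to 128 bits, instead of murmurhash; that is slower than murmurhash but gives the same size of id space. The names `pathToId` and `uniquePathId` and the salt scheme are assumptions made for illustration.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical helper: map a file path to a 128-bit BigInt.
// SHA-256 truncated to 128 bits stands in for murmurhash here.
function pathToId(filePath, salt = 0) {
  // Resolve first so that "dir/../a.txt" and "a.txt" hash identically.
  const normalized = path.resolve(filePath);
  const digest = crypto
    .createHash('sha256')
    .update(`${salt}:${normalized}`)
    .digest('hex');
  return BigInt('0x' + digest.slice(0, 32)); // first 32 hex chars = 128 bits
}

// Optional "paranoid" layer: remember which path owns each id and
// rehash with the next salt if a different path already holds it.
const idToPath = new Map();
function uniquePathId(filePath) {
  const normalized = path.resolve(filePath);
  for (let salt = 0; ; salt++) {
    const id = pathToId(normalized, salt);
    const owner = idToPath.get(id);
    if (owner === undefined) {
      idToPath.set(id, normalized);
      return id;
    }
    if (owner === normalized) return id;
  }
}
```

`pathToId` alone is a pure function of the path, so separate processes agree on the id without talking to each other. The rehashing layer gives that up, because the result then depends on which paths were seen first.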

Gene

This is just my comment turned into an answer. If everything runs in memory, you can use one of the standard hashmaps in your language. This works not just for file names but for any similar situation. Hashmaps in most programming languages resolve collisions with buckets, so the hash value together with the entry's position in its bucket provides a unique id.

By the way, it is not hard to write your own hashmap, so that you have control over the underlying structure (e.g. to retrieve that number).
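As an illustration of the bucket idea, here is a minimal hand-rolled table in JavaScript. Everything in it is an assumption made for the sketch: the class name `PathIdTable`, the fixed bucket count, the FNV-1a-style string hash, and the rule that entries are never deleted or rehashed (otherwise the bucket positions, and so the ids, would change).

```javascript
// Hypothetical table: fixed bucket count, no deletion, no resizing,
// so a (bucket, slot) pair never changes once it has been assigned.
class PathIdTable {
  constructor(bucketCount = 1024) {
    this.buckets = Array.from({ length: bucketCount }, () => []);
  }

  // FNV-1a-style 32-bit hash over the string's UTF-16 code units.
  static hash(str) {
    let h = 0x811c9dc5;
    for (let i = 0; i < str.length; i++) {
      h ^= str.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
    }
    return h;
  }

  // Returns the same integer for the same path, and distinct integers
  // for distinct paths, within this one table instance.
  idFor(filePath) {
    const n = this.buckets.length;
    const b = PathIdTable.hash(filePath) % n;
    const bucket = this.buckets[b];
    let slot = bucket.indexOf(filePath);
    if (slot === -1) {
      slot = bucket.push(filePath) - 1; // append; slot is the new index
    }
    // Encode the (slot, bucket) pair as one integer; b < n keeps it unique.
    return slot * n + b;
  }
}
```

The ids are only meaningful inside one process, since they depend on insertion order within each bucket.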

Saeed Amiri
  • In my case, with multiple processes (not all sharing the same memory), it might be harder to get this right – Alexander Mills Jan 10 '17 at 20:48
  • Yes, it may not be easy, but to overcome this you can write a service that maintains the hashmap. A decent programming language normally has support for concurrency, so all the other applications can call that service to get a unique number. From time to time (e.g. at midnight), the service can automatically save those unique numbers to a database or a file. This way you won't lose information. – Saeed Amiri Jan 11 '17 at 00:58
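The service described in the last comment could look roughly like the Node.js sketch below. It is only one possible shape: the class `IdRegistry`, the function `startService`, the `/id?path=...` route, the port, the hourly snapshot interval, and the JSON snapshot file are all assumptions, not something taken from the thread.

```javascript
const fs = require('fs');
const http = require('http');

// Hypothetical registry: hands out sequential integers, one per distinct path.
class IdRegistry {
  constructor(snapshotFile) {
    this.snapshotFile = snapshotFile;
    this.ids = new Map();
    // Reload earlier assignments so ids survive a restart.
    if (fs.existsSync(snapshotFile)) {
      const entries = JSON.parse(fs.readFileSync(snapshotFile, 'utf8'));
      for (const [filePath, id] of entries) this.ids.set(filePath, id);
    }
  }

  idFor(filePath) {
    // Entries are never deleted, so size + 1 is always an unused id.
    if (!this.ids.has(filePath)) this.ids.set(filePath, this.ids.size + 1);
    return this.ids.get(filePath);
  }

  save() {
    fs.writeFileSync(this.snapshotFile, JSON.stringify([...this.ids]));
  }
}

// Other processes would call GET /id?path=... to obtain an id.
function startService(registry, port = 8123) {
  const server = http.createServer((req, res) => {
    const url = new URL(req.url, 'http://localhost');
    const filePath = url.searchParams.get('path');
    if (url.pathname !== '/id' || !filePath) {
      res.writeHead(400);
      return res.end();
    }
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ id: registry.idFor(filePath) }));
  });
  // Periodic snapshot; ids handed out since the last save are lost on a crash.
  const timer = setInterval(() => registry.save(), 60 * 60 * 1000);
  timer.unref();
  server.on('close', () => clearInterval(timer));
  return server.listen(port);
}
```

Because Node runs request handlers on a single thread, `idFor` needs no extra locking here; a service written in a multi-threaded language would have to guard the map.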