
Path.GetRandomFileName returns a cryptographically strong random string of 11 characters. It is built as 8 characters, a dot, and 3 characters, for example: "b253i5vu.psf". Each character is either a lowercase letter or a digit from 0 to 5, which gives 32 possible values per character, so the number of possible strings is 32^11.

My question has already been asked here and here, but the answers are all wrong, because:

  1. They say that the chance of a collision is some fixed value X, while in fact it depends on how many files are already in the directory. For example, if you have a dir with 100,000 files that were generated with Path.GetRandomFileName, the chance of a collision is higher than for a dir with 1000 files.

  2. They don't take into account the Birthday Problem.

If possible, I would appreciate it if you could present the formula in a way that is easy to use for people who don't have university-level math knowledge, or give instructions on how to calculate it for a specific value (for example, if the dir has 1000 files).

Bohoo
  • Possible duplicate of [Probability of already existing file System.IO.Path.GetRandomFileName()](http://stackoverflow.com/questions/27945559/probability-of-already-existing-file-system-io-path-getrandomfilename) – Mike Nakis Feb 11 '17 at 15:00
  • "Chances for" is a synonym for "Probability of", and "collision" has exactly the same meaning as "already existing", so this question is an exact duplicate of http://stackoverflow.com/questions/27945559/probability-of-already-existing-file-system-io-path-getrandomfilename – Mike Nakis Feb 11 '17 at 15:01
  • The issue of existing files in the directory is irrelevant. The probability of a collision when you have 1000 existing files is exactly the same as the probability of a collision if you start with an empty directory and invoke the function 1000 times to create 1000 files. – Mike Nakis Feb 11 '17 at 15:02
  • @MikeNakis: 1) I already wrote in my question that this question has been asked before. Just read it. 2) You are wrong. The probability of a collision for an empty directory is different from the probability of a collision when the dir already has 1000 files. – Bohoo Feb 11 '17 at 15:18
  • Probability of a collision = (number of files in directory)/32^11. – jdweng Feb 11 '17 at 15:57
  • This question is imprecise. It's not entirely clear to me what **exactly** you want to calculate. A stochastic distribution, some stochastic process? What exactly is the task? One example emphasizing this: if you already have **N files** and *assume these are all different* (because something would probably be broken if not), the probability of a collision when adding **one new file** is just ```1-((32^11 - N) / 32^11) = N/(32^11)``` (as jdweng stated). Because of that assumption, the birthday paradox is irrelevant here. Maybe you are interested in this result, maybe you are not. It's hard to tell. – sascha Feb 13 '17 at 12:26

1 Answer


Answering my own question, with help from a user on Math Stack Exchange:

To calculate the probability of a collision, use the Birthday Problem formula, but with 32^11 instead of 365. The probability is approximately:

1 - exp( (-n^2) / (2*32^11) )

where n is the number of files that were already generated with Path.GetRandomFileName.
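
For anyone who wants to see where the formula comes from, here is a short sketch of the standard birthday-problem approximation. The symbol N is introduced here only for illustration and stands for the number of possible names, 32^11:

```latex
\begin{align*}
P(\text{no collision among } n \text{ names})
  &= \prod_{k=0}^{n-1} \left(1 - \frac{k}{N}\right)
   \approx \prod_{k=0}^{n-1} e^{-k/N}
   = e^{-n(n-1)/(2N)}
   \approx e^{-n^2/(2N)}, \\
P(\text{at least one collision})
  &\approx 1 - e^{-n^2/(2N)}, \qquad N = 32^{11}.
\end{align*}
```

The approximation uses 1 - x ≈ e^(-x) for small x, so it holds well as long as n is much smaller than N.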

You can easily calculate the probability using a site like WolframAlpha.

For example, if n=1000, paste the following text into the WolframAlpha search box:

1-exp((-1000^2)/(2*32^11))

This gives about 1.39 * 10^(-11).
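
If you would rather compute this locally, the formula above can be sketched in a few lines of Python. This is only an illustration under the question's assumption of 32^11 possible names. The function names `collision_probability` and `next_file_collision_probability` are made up for this example; the second one implements the N/32^11 case from the comments (one new file added to N existing distinct names).

```python
import math

# Assumption taken from the question: 11 random characters,
# each with 32 possible values.
NAME_SPACE = 32 ** 11


def collision_probability(n: int, space: int = NAME_SPACE) -> float:
    """Approximate chance that n random names contain at least one duplicate.

    Uses the birthday-problem approximation 1 - exp(-n^2 / (2 * space)).
    """
    # -expm1(-x) is mathematically equal to 1 - exp(-x), but it keeps
    # its precision when x is extremely small, as it is here.
    return -math.expm1(-(n * n) / (2 * space))


def next_file_collision_probability(existing: int, space: int = NAME_SPACE) -> float:
    """Chance that ONE new name matches any of `existing` distinct names."""
    return existing / space


if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        print(f"n={n:>10,}: {collision_probability(n):.3e}")
```

`math.expm1` is used instead of a plain `1 - math.exp(...)` because subtracting two numbers that are both almost exactly 1 throws away most of the significant digits at probabilities this small.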

Bohoo