-1

I'm writing a program which basically processes data and outputs many files. It will never produce more than 10-20 files per run. I just wanted to know if using this method to generate unique filenames is a good idea? Is it possible that rand will choose, let's say, x and then, within 10 instances, choose x again? Is using Random a good idea? Any input will be appreciated!

Random rand = new Random();
int randNo = rand.Next(100000, 999999);

using (var write = new StreamWriter("C:\\test" + randNo + ".txt"))
{
    // Stuff
}
sparta93
  • Are you storing these files forever or just temporarily? You are going to get a duplicate number eventually if it's forever. – FGhilardi Jun 18 '15 at 15:20
  • This question is going to generate a lot of opinion-based answers. As such, it isn't a good SO question. – Bill Gregg Jun 18 '15 at 15:20
  • How fast are you creating the files? If there's a reasonable delay between each write, you could use some kind of timestamp instead. – vvye Jun 18 '15 at 15:21
  • 6
    See [`Path.GetRandomFileName()`](https://msdn.microsoft.com/en-us/library/system.io.path.getrandomfilename(v=vs.110).aspx). – piedar Jun 18 '15 at 15:22
  • http://stackoverflow.com/questions/21684696/will-path-getrandomfilename-generate-a-unique-filename-everytime – Steve Jun 18 '15 at 15:25
  • Does it have to be random? Could you use a long instead and just iterate? Either that or use GUIDs, as some of the answers have stated. – user1666620 Jun 18 '15 at 15:26

6 Answers

5

I just wanted to know if using this method to generate unique filenames is a good idea?

No. Uniqueness isn't a property of randomness. Random means that the resulting value is not in any way dependent upon previous state. Which means repeats are possible. You could get the same number many times in a row (though it's unlikely).

If you want values which are unique, use a GUID:

Guid.NewGuid();

As pointed out in the comments below, this solution isn't perfect. But I contend that it's good enough for the problem at hand. The idea is that Random is designed to be random, and Guid is designed to be unique. Mathematically, "random" and "unique" are non-trivial problems to solve.

Neither of these implementations is 100% perfect at what it does. But the point is to simply use the correct one of the two for the intended functionality.

Or, to use an analogy... If you want to hammer a nail into a piece of wood, is it 100% guaranteed that the hammer will succeed in doing that? No. There exists a non-zero chance that the hammer will shatter upon contacting the nail. But I'd still reach for the hammer rather than jury rigging something with the screwdriver.
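Applied to the file names in the question, the GUID approach might look like the following minimal sketch. The class and method names (`GuidFileNames`, `NewPath`) are made up for illustration, and the `"test"` prefix and `.txt` extension are carried over from the question's code.

```csharp
using System;
using System.IO;

public static class GuidFileNames
{
    // Hypothetical helper: builds <directory>\<prefix><32 hex digits>.txt.
    // The "N" format specifier prints the GUID as 32 hex digits,
    // without dashes or braces, which keeps the file name compact.
    public static string NewPath(string directory, string prefix)
    {
        string name = prefix + Guid.NewGuid().ToString("N") + ".txt";
        return Path.Combine(directory, name);
    }
}
```

It could then replace the path expression in the question, for example `new StreamWriter(GuidFileNames.NewPath("C:\\", "test"))`.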

David
  • The first paragraph contradicts the second paragraph. `Guid.NewGuid()` doesn't guarantee unique numbers. It guarantees random numbers. It uses [CoCreateGuid](https://msdn.microsoft.com/en-us/library/windows/desktop/ms688568(v=vs.85).aspx), whose documentation says: *"**To a very high degree of certainty**, this function returns a unique value"* – xanatos Jun 18 '15 at 15:27
  • 2
    @xanatos [GUIDs are designed to be unique, not random](http://blogs.msdn.com/b/oldnewthing/archive/2012/05/23/10309199.aspx). – piedar Jun 18 '15 at 15:32
  • 1
    @xanatos: True, there exists a *non-zero* chance of a collision. In the case of a GUID, a 1% chance of collision exists when generating 2.6x10^18 values. It's not *guaranteed*, but it's probably the best "random" generator to get the job done. I suppose a more guaranteed approach to uniqueness could be to take advantage of something like a SQL IDENTITY column (or roll one's own in code, but which is non-trivial) to generate the next value, which seems like overkill for the problem at hand. – David Jun 18 '15 at 15:35
  • @David Yep... Exactly as you said – xanatos Jun 18 '15 at 15:37
  • (It's also worth noting, in my previous suggestion for using SQL IDENTITY, that the odds of the database server, or whatever else you use, failing are *considerably* higher than the odds of a GUID collision.) – David Jun 18 '15 at 15:50
  • @David Small note... The collision is a little more probable... Only 122 bits of a Guid.NewGuid are "random". 6 bits are fixed. But it doesn't change anything important. – xanatos Jun 18 '15 at 15:55
4

No, this is not the correct method to create temporary file names in .NET.

The right way is to use either Path.GetTempFileName (creates the file immediately) or Path.GetRandomFileName (creates a high-quality random name).

Note that there is not much wrong with using Random, Guid.NewGuid() or DateTime.Now to generate a small number of file names, as covered in other answers, but using functions that are designed for this particular purpose leads to code that is easier to read and to prove correct.
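As a rough sketch of how the two `Path` methods differ in use (the wrapper class `TempNames` and its method names are invented for this example; only `Path.GetTempFileName`, `Path.GetRandomFileName` and `Path.Combine` are framework calls):

```csharp
using System;
using System.IO;

public static class TempNames
{
    // Path.GetTempFileName creates an empty file in the user's temp
    // folder and returns its full path, so the name is already reserved.
    public static string CreateEmptyTempFile()
    {
        return Path.GetTempFileName();
    }

    // Path.GetRandomFileName only returns a random name; no file is
    // created, so the caller decides where (and whether) to create it.
    public static string RandomPathIn(string directory)
    {
        return Path.Combine(directory, Path.GetRandomFileName());
    }
}
```

The first method suits throwaway scratch files; the second is closer to what the question does, since the output directory stays under the caller's control.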

Alexei Levenkov
2

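There is what is called the Birthday Paradox. If you generate several random numbers, the probability of encountering a "collision" grows quickly with how many you generate. Once you have generated sqrt(numberofpossiblevalues) values, the probability of a collision is already around 40%, and it reaches 50% at about 1.18 times that count. Here rand.Next(100000, 999999) can return 899999 different values (the upper bound is exclusive), and sqrt(899999) is about 949. That is quite low. At 10-20 files per run, if the files accumulate, you reach a 50% chance of a collision after roughly 56-112 calls to your program.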

Note that random being random, if you generate two random numbers, there is a non-zero possibility of a collision, and if you generate numberofpossiblevalues + 1 random numbers, the possibility of a collision is 1.
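The birthday estimate can be checked numerically. The sketch below assumes every value is equally likely and multiplies out the probability that all draws are distinct; `Birthday` and `CollisionProbability` are names chosen for this example, not framework APIs.

```csharp
using System;

public static class Birthday
{
    // Probability that at least two of `draws` uniform, independent picks
    // from `possibleValues` distinct values are equal.
    public static double CollisionProbability(long possibleValues, long draws)
    {
        // Pigeonhole: more draws than values forces a repeat.
        if (draws > possibleValues) return 1.0;

        // P(all distinct) = (1 - 1/N) * (1 - 2/N) * ... * (1 - (draws-1)/N)
        double allDistinct = 1.0;
        for (long i = 1; i < draws; i++)
        {
            allDistinct *= 1.0 - (double)i / possibleValues;
        }
        return 1.0 - allDistinct;
    }
}
```

For instance, `Birthday.CollisionProbability(365, 23)` comes out a little above 0.5, the classic birthday figure. Passing 899999 as the first argument (the number of values `rand.Next(100000, 999999)` can return, since its upper bound is exclusive) gives the odds for the scheme in the question.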

Now... Someone will tell you that Guid.NewGuid will always generate unique values. They are sellers of very good snake oil. As written on MSDN, in the Guid.NewGuid page:

The chance that the value of the new Guid will be all zeros or equal to any other Guid is very low.

The chance isn't 0, it is very (very, very, I'll add) low! Here the Birthday Paradox kicks in. A Microsoft Guid has 122 bits of "random" part and 6 bits of "fixed" part, so a 50% chance of a collision is reached only after around 2.7x10^18 values. It is a big number! A 1% chance of collision is reached after 3.27x10^17 values... still a big number!

Note that Microsoft generates these 122 bits with a strong random number generator: https://msdn.microsoft.com/en-us/library/bb417a2c-7a58-404f-84dd-6b494ecf0d13#id9

Windows uses the cryptographic PRNG from the Cryptographic API (CAPI) and the Cryptographic API Next Generation (CNG) for generation of Version 4 GUIDs.

So while the whole Guid generated by Guid.NewGuid isn't totally random (because 6 bits are fixed), it is still quite random.

xanatos
2

If you want to generate a unique value, there's a tool specifically designed for generating unique identifying values: a Globally Unique IDentifier (GUID).

var guid = Guid.NewGuid();

Leave the problem of figuring out the best way of creating such a unique value to others.

Servy
0

I would think it would be a good idea to include the date and time the file was created in the file name, to make sure it is not duplicated. You could also add random numbers to this if you want to make a collision even less likely (in case your 10 files are saved at the exact same time).

So the file's name might be file06182015112300.txt (showing the month, day, year, hour, minute and second).
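A minimal sketch of producing a name in that layout, assuming the `file` prefix and `.txt` extension from the example above (`TimestampNames.For` is a hypothetical helper, not a framework method):

```csharp
using System;
using System.Globalization;

public static class TimestampNames
{
    // Formats the timestamp as month, day, year, hour (24h), minute, second.
    // InvariantCulture keeps the digits independent of the machine's locale.
    public static string For(DateTime created)
    {
        string stamp = created.ToString("MMddyyyyHHmmss", CultureInfo.InvariantCulture);
        return "file" + stamp + ".txt";
    }
}
```

Calling `TimestampNames.For(DateTime.Now)` at each write gives one name per second. Two files written within the same second would still get the same name, which is where the extra random suffix mentioned above comes in.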

starsg38
0

If you want to use files of that format, and you know you won't run out of unused numbers, it's safer to check that the random number you generate isn't already used as follows:

Random rand = new Random();
string filename = "";
do
{
    int randNo = rand.Next(100000, 999999);
    filename = "C:\\test" + randNo + ".txt";
} while (File.Exists(filename));
using (var write = new StreamWriter(filename))
{
//Stuff
}
tsandy
  • 1
    It's possible for the file to be created after you check if it exists but before you create it and take out that lock. If the OS provided some sort of `TryCreate` operation, then you could use this approach, but as is, it's not safe because it's not atomic. – Servy Jun 18 '15 at 15:28
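One possible way to close the gap between the check and the creation is to open the file with `FileMode.CreateNew`, which fails with an `IOException` when the file already exists, so checking and creating happen in a single call. The sketch below is an illustration under that assumption; `UniqueFiles.CreateUnique` is a made-up helper, and the number range and `.txt` extension are taken from the question.

```csharp
using System;
using System.IO;

public static class UniqueFiles
{
    // Hypothetical helper: keeps drawing numbers until a file with that
    // name can be created, and returns a writer for the new file.
    public static StreamWriter CreateUnique(string directory, string prefix, Random rand)
    {
        while (true)
        {
            string path = Path.Combine(directory, prefix + rand.Next(100000, 999999) + ".txt");
            try
            {
                // CreateNew throws IOException if the file already exists,
                // so no separate File.Exists check is needed beforehand.
                var stream = new FileStream(path, FileMode.CreateNew, FileAccess.Write);
                return new StreamWriter(stream);
            }
            catch (IOException) when (File.Exists(path))
            {
                // The name is taken: loop and try another number.
                // Other I/O errors (missing directory, etc.) are not caught.
            }
        }
    }
}
```

As with the loop in the answer, this never terminates if every number in the range is already in use, so it only suits cases where the range is far larger than the number of files.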