0

I have a primary key in my SQLDB as BIGINT (which is 64bit) Now since C# doesnt have a mechanism to generate 64bit key directly I used the following code for generation of keys. My problem is that I'm seeing random but clustered values in my key column.

tl.ID = (this.r.Next() << 32) | this.r.Next();

Where tl.ID is a LINQ to SQL Class and is a LONG. this.r is instance of random

Random r = new Random((int)DateTime.Now.Ticks);

Some of the values seen in DB:

MAX:2147483647
Top few values:
2147478265
2147478479
2147478526
2147479034
2147479663
2147480695
2147480783
2147480887
2147480984
2147481817
2147482099
2147482607
2147483321
2147483391
2147483558
2147483644
2147483647
confusednerd
  • 138
  • 3
  • 17
  • Why not just have auto increment enabled for primary keys in database? – danish Dec 10 '14 at 10:06
  • I dont think thats possible once the DB has been created. – confusednerd Dec 10 '14 at 10:06
  • Not a duplicate, becuase I'm using the same technique mentioned in the post. Create an item and shift by 32 bits. My problem is that most of these generated values are very close to each other. Is there a problem with some cast ? – confusednerd Dec 10 '14 at 10:13
  • @confusednerd, You can make a field an identity after the fact, you just have to ensure that your initial seed is higher than the highest existing value. Also not sure why you think a key needs to be random, it just needs to be unique and if it is sequential then your indexes will perform better. – Ben Robinson Dec 10 '14 at 10:13
  • @confusednerd *It's probably worth it to generate 4 16-bit integers rather than 2 32-bit ones to avoid signed-unsigned problems. But at this point the solution loses its elegancy, so I think it's best to stick with Random.NextBytes version [...] It looks pretty well in terms of value distribution (judging by very simple tests I ran)* –  Dec 10 '14 at 10:22

4 Answers4

2

A primary key first of all should be unique.

Add an identity column to your database / auto increment and you will be fine.

If needed, add the new identity column with auto increment, drop the old one, rename the new one to the old name and you will be fine

Pleun
  • 8,856
  • 2
  • 30
  • 50
  • There's at least one justified reason, I can think of: if you need to generate the PK in your domain (not sql), to build up hierarchies (like parent-child) in bulk. I would rather go for a GUID in this scenario though. –  Dec 10 '14 at 10:27
  • Well yes, but in that case the Random part in the question seriously confuses me. – Pleun Dec 10 '14 at 11:07
  • 1
    Well, if you do not have a pool to increment from the next *best chance* is to do it randomly ;) Which is by no means a correct solution... –  Dec 10 '14 at 11:16
1

You have to cast the number to a long before shifting, otherwise this will just shift out the significant bits.

tl.ID = ((long)this.r.Next() << 32) | this.r.Next();

Just keep in mind that this method isn't perfect, as pointed out in the question mentioned in the comments, since Random.Next() only returns positive values, so you actually only get 62 random bits.

AlexDev
  • 4,049
  • 31
  • 36
  • Brilliant! This was the problem! – confusednerd Dec 10 '14 at 10:22
  • This method can only generate 2^32 many ids, so you might as well cast 1 `int` to `long` – weston Dec 10 '14 at 10:29
  • Why 2^32? I was going to point out now that it can generate 2^62 unique ids because r.Next only returns positive values. – AlexDev Dec 10 '14 at 10:34
  • `Random` is deterministic. There are 2^32 possible from the first call, but the second call is entirely predictable and will always return the same value following the first. I.e so if the first call to `Next()` produces `1589351479` the next `Next()` call is always `636837953` – weston Dec 10 '14 at 10:43
  • @weston That's the first time I hear that. Do you have any sources for that? Is this flaw inherent to Random.Next() or would it apply also to a solution using Random.NextBytes? – AlexDev Dec 10 '14 at 10:50
  • Try it: `var rand = new Random (123); Assert.AreEqual (1589351479, rand.Next()); Assert.AreEqual (636837953, rand.Next());` – weston Dec 10 '14 at 10:52
  • And yes it would apply to a nextbytes solution. It's because the seed for the next item is based on the last random value. – weston Dec 10 '14 at 10:54
  • Well obviously if you use a fixed seed you are always going to get the same sequence, then you get 2^0, not 2^32. What you stated in the first comment implied that the Random.Next function is periodical, which is something completely different. – AlexDev Dec 10 '14 at 10:59
  • It's not about the fixed seed, that was to show it is deterministic. Another way to think about it: There are only 2^32 states the `Random` class can be in before your method is called. Therefore there can only be 2^32 possible outcomes. – weston Dec 10 '14 at 11:00
  • No new random entropy is introduced. It relies entirely upon a 32bit seed which is updated (in a predictable way) each time it generates a random value. – weston Dec 10 '14 at 11:03
  • run this on linqpad. You'll see that 1796769800 appears twice, once followed by 321286284 and once followed by 1265044085.var rand = new Random (123); var used = new HashSet(); while(true) { int a = rand.Next(); a.Dump(); if(used.Contains(a)) break; used.Add(a); } rand.Next().Dump(); – AlexDev Dec 10 '14 at 11:14
  • What does that prove/show? – weston Dec 10 '14 at 11:16
  • It show that "There are 2^32 possible from the first call, but the second call is entirely predictable and will always return the same value following the first" does not happen, at least not in my currently installed framework. So from a pseudo-random perspective, you get 64 bits of randomness. If you are talking about real, cryptographic randomness, you actually have much less than 32 bits of entropy. – AlexDev Dec 10 '14 at 11:22
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/66566/discussion-between-weston-and-alexdev). – weston Dec 10 '14 at 11:23
1

As mentioned by AlexDev, the problem was in the first case, before shifting it by 32 bits it should be converted into a long.

confusednerd
  • 138
  • 3
  • 17
1

Problem 1: Using a random number to generate a primary key is probably a bad idea as eventually you'll end up with clashes, so with the code you have you will run into problems unless you check that every generated key is unique prior to using it. If you really need random numbers as keys use the sollution below.

Problem 2: The Random class method Next generates an int, which is 32 bit in your case, this will wrap around and you'll only have half the range intended. Another problem is that the standard Random class is not that a good random generator.

Problem 3: Using two Int32 generated by Random and shift one (into a long) and append another is not the best way to generate a 64 bit random number, your chances on doubles rise (this is a crypto topic that is beyond this discussion so I'll leave it at that :))

Solution: To really generate a random Int64 you can use the Random or even better the RandomNumberGenerator to fill an array of 8 bytes for your 64bit number and convert that array to long (Int64) e.g.:

using System.Security.Cryptography;

var rng = RandomNumberGenerator.Create();
//bytes to hold number
var bytes = new byte[8];
//randomize
rng.GetNonZeroBytes(bytes);
//Convert
long random = BitConverter.ToInt64(bytes, 0);

This will generate a true 64 bit random number, the RandomNumberGenerator is a better random generator (less predictable) than the Random class.

Tom
  • 151
  • 1
  • 6
  • I'm not sure if it's an extra problem point or already covered, but as `Random` works on 32bits there can only be 2^32 possible outcomes. So you can only generate 2^32 longs by any concat method. – weston Dec 10 '14 at 10:35
  • It's already covered in **Problem 2** :) Random does have a method NextBytes, which like RandomNumberGenerator allows you to fill an array of bytes with random numbers, so you could use it. But RandomNumberGenerator is a better 'more random' generator. E.g. it is recommended for cryptographic purposes. – Tom Dec 10 '14 at 10:41
  • Ah no, if you use next 8 Bytes that still produces only 2^32 unique combinations. – weston Dec 10 '14 at 10:45
  • Another way to think about it: There are only 2^32 states the Random class can be in before the method is called. Therefore there can only be 2^32 possible outcomes. Regardless of it you use nextbytes or nextint. – weston Dec 10 '14 at 11:01
  • I do not agree with that last statement, a byte consists of 8 bits, so 2^8 combinations. However if you use 8 bytes, you will have 2^64 possible outcomes, not 2^32. (256x256x256x256). – Tom Dec 10 '14 at 12:18