2

Isn't this poor design?

It seems to me that your primary key for a table, especially a common one like Users, should be an integer because indexing will (presumably) be much faster?

Does MSSQL do something special behind the scenes to make indexes on strings fast?

I seen some other posts about how to change the PK on users to an int, so I assume that other people have similar performance concerns. At the same time, MS has a ton of people who just sit around all day and think this stuff up, so maybe I'm missing something?

What benefits do GUIDs give you over ints? The only one that I can think of is that you can create a new GUID without fear of collision or special counter logic - however it seems like if you mark the column as identity in your schema that the database would auto-increment this for you. Thus it would only be a benefit for sites with enough users that the User DB needs to be sharded.

John Shedletsky
  • 7,110
  • 12
  • 38
  • 63
  • GUIDs are really useful in the case where your application is setup on multiple DBs. Ints wont work in a distributed environment. Anyway, this question was already answered by one of the ASP.NET team. http://stackoverflow.com/questions/19238621/why-do-the-asp-net-identity-interfaces-use-strings-for-primary-and-foreign-keys/19283502#19283502 – jamesSampica Oct 22 '14 at 05:43
  • And you'll reach that point much faster if you use GUIDs for your PKs everywhere. I still think it's a poor design choice. – John Shedletsky Oct 22 '14 at 21:48
  • I'm not sure what you mean. A userId of "2" in Database A could mean something different than a userId of "2" in Database B. That creates a problem when you have an application that uses both databases for auditing or aggregates the data in a meaningful way. GUIDs solve that because you will never have a primary key collision in either database. It isn't a poor choice under those circumstances. – jamesSampica Oct 23 '14 at 01:47
  • In principle, I don't believe performance should be sacrificed just to make it slightly easier on the programmers. Using a string GUID is making your PK 38 bytes long, this is a serious hit. – John Shedletsky Oct 23 '14 at 17:53
  • It isn't just "slightly easier for programmers" though. Lets say you have two `Users` tables across two DBs. You want to take both those tables and join them on a `Products` table in a third DB. How are you going to make that join? You can't because your primary key column has rows with duplicates. Now you have no primary key. You have no way to join those tables. Space isn't that big of a deal anymore (and guids are just 16 bytes compared to 4) and COMBs can provide better paging performance. – jamesSampica Oct 23 '14 at 18:30
  • Is this in production? If so, I would have each sharded user DB use a different range of ints for their PKs. However, I do not think there is a website on the planet that needs to shard their User DB, as a couple billion records is just not a lot. – John Shedletsky Oct 23 '14 at 20:19
  • Okay, so you've sharded your DB. Now 6 months later management wants another duplicate production DB located in South America. Now you have to update and reshard your other DBs so that the new DB gets its own range, making sure you don't have collisions and hopefully creating nice and even int ranges. Repeat a couple more times over a few years for effect and maybe now you see why GUIDs are better for this. – jamesSampica Oct 23 '14 at 20:41
  • There's a lot of namespace in a 64 bit integer ;) – John Shedletsky Oct 24 '14 at 22:08

1 Answers1

0

Using an int would be faster. Id of IdentityUser is a string so as a convinient solution, you can subclass it and out of the box use int, Guid or for example MongoDB ObjectId (stored as a string, of course). This is a confortable solution, although not optimal performance-wise. However, if you wish you can change it to real int. Here is how: http://www.asp.net/identity/overview/extensibility/change-primary-key-for-users-in-aspnet-identity

MS SQL doesn't do anything special. String is only used due to the aforementioned reason.

You can refer to this article in regards of differences between int and Guid primary keys in database: http://blog.codinghorror.com/primary-keys-ids-versus-guids/

Adam Szabo
  • 11,302
  • 18
  • 64
  • 100
  • The Atwood article you link to suggests that using GUIDs as a PK incurs a 10x perf hit on fetching records. – John Shedletsky Oct 21 '14 at 21:55
  • Are you suggesting that the default class using GUIDs makes it easier to subclass? – John Shedletsky Oct 21 '14 at 21:55
  • Yes, easier to subclass it because the default class is using string, where you can store anything (however not too efficiently): number, guid, other kind of alphanumeric id. If you are likely to have so many users that you need to optimize, then you can refer to the other article I linked and go for a "real Guid" implementation, and use sequential Guids for max performance. Some benchmarks regarding sequential Guids: http://www.informit.com/articles/printerfriendly/25862 and http://stackoverflow.com/questions/170346/what-are-the-performance-improvement-of-sequential-guid-over-standard-guid – Adam Szabo Oct 21 '14 at 22:38