11

For my project, which is a potentially big web site, I have chosen to separate the command interface from the query interface. As a result, submitting commands are one-way operations that don't return a result. This means that the client has to provide the key, for example:

service.SubmitCommand(new AddUserCommand() { UserId = key, ... });

Obviously I can't use an int for the primary key, so a Guid is a logical choice - except that I read everywhere about the performance impact it has, which scares me :)

But then I also read about COMB Guids, and how they provide the advantages of Guid's while still having a good performance. I also found an implementation here: Sequential GUID in Linq-to-Sql?.

So before I take this important decision: does someone have experience with this matter, of advice?

Thanks a lot!

Lud

Community
  • 1
  • 1
L-Four
  • 13,345
  • 9
  • 65
  • 109

3 Answers3

7

First of all, I use sequential GUIDs as a primary key and I don't have any problems with performance.

Most of tests Sequential GUID vs INT as primary key operates with batch insert and selects data from idle database. But in a real life selects and updates happen in SAME time.

As you are applying CQRS, you will not have batch inserts and burden for opening and closing transactions will take much more time than 1 write query. As you have separated read storage, your select operations on a table with GUID PK will be much faster than they would be on a table with INT PK in a unified storage.

Besides, asynchrony, that gives you messaging, allows your applications scale much better than systems with blocking RPC calls can do.

In consideration of aforesaid, choosing GUIDs vs INTs seems to me as be penny-wise and pound-foolish.

xelibrion
  • 2,220
  • 1
  • 19
  • 24
  • Thanks. But, actually, I'm not planning to use a full-blown CQRS (see my other post: http://stackoverflow.com/questions/6917843/architecture-simple-cqs). Does this influence your comment? – L-Four Aug 03 '11 at 06:27
  • 1
    A little bit. You will use DDD principles anyway and so you still will not have batch insert/update operations. And if you want to do insert operations in async manner, you don't have other choice than using GUIDs as primary key. Of course this will create some (minor) performance overhead for a database, but ability to simply scale your application matters. – xelibrion Aug 03 '11 at 08:37
  • Don't forget to save processed messages to have ability to migrate to separated read storage – xelibrion Aug 03 '11 at 08:41
6

You didn't specify which database engine you are using, but since you mentioned LINQ to SQL, I guess it's MS SQL Server.
If yes, then Kimberly Tripp has some advice about that:

To summarize the two links in a few words:

  • sequential GUIDs perform better than random GUIDs, but still worse than numeric autoincrement keys
  • it's very important to choose the right clustered index for your table, especially when your primary key is a GUID
Christian Specht
  • 35,843
  • 15
  • 128
  • 182
  • yes, it's sql server! But if I want to use query/command pattern, do I really have another choice than using (sequantial) guids? – L-Four Aug 03 '11 at 07:29
  • 1
    @Lud: no, you haven't because you need generate id before send a command. And if it will potentially big site you for sure need to use unique id, because of you will need distribute your system in the future. – Andrew Orsich Aug 03 '11 at 23:10
5

Instead of supplying a Guid to a command (which is probably meaningless to the domain), you probably already have a natural key like username which serves to uniquely identify the user. This natural key make a lot more sense for the user commands:

  1. When you create a user, you know the username because you submitted it as part of the command.
  2. When you're logging in, you know the username because the user submitted it as part of the login command.

If you index the username column properly, you may not need the GUID. The best way to verify this is to run a test - insert a million user records and see how CreateUser and Login perform. If you really to see a serious performance hit that you have verified adversely affects the business and can't be solved by caching, then add a Guid.

If you're doing DDD, you'll want to focus hard on keeping the domain clean so the code is easy to understand and reflects the actual business processes. Introducing an artificial key is contrary to that goal, but if you're sure that it provides actual value to the business, then go ahead.

Josh Kodroff
  • 27,301
  • 27
  • 95
  • 148
  • 1
    This is an interesting idea for users; indeed, user name is unique. Seems like a good solution. Would you do the same for an article, which has a unique title (urltitle)? What about an AddCategoryCommand, what would be the key there? – L-Four Aug 03 '11 at 17:55
  • For a category, I'd go with the name. You might want to formulate a slug instead which cannot be changed, much like the "cqrs-and-primary-key" part of the URL of this question. (I don't know if that's truly a slug - you'd have to try changing the title of the question and seeing what happens.) – Josh Kodroff Aug 03 '11 at 20:58
  • In my project, for some commands, I need to execute the command, then execute a query to get back a more succinct identifier. For example, my orders have an identity column, but they also have a natural key of (customerId, referenceNumber) so I have a query method that returns the order id (the identity int) for a given customerId and referenceNumber. – Josh Kodroff Aug 03 '11 at 21:00
  • it seems I can just change the title of this post. So probably it's not the key... – L-Four Aug 04 '11 at 06:03
  • Yeah, so the concept of a slug is still probably what you want but the stack overflow urls are not a good example. :) However, the slug pattern is definitely used out there somewhere - it's just a string ID derived from the title of an article. Maybe WordPress uses them. Is the concept clear? – Josh Kodroff Aug 05 '11 at 19:52
  • yes it's clear, the only thing that is left is to understand is what the fifference is between such a string and a guid (performance)? Is a string better for performance then? – L-Four Aug 08 '11 at 07:55
  • Only one way to find out - create millions of random strings and millions of guids and see what happens! Also, make sure you index the column but that shouldn't be an issue because it's your primary key. – Josh Kodroff Aug 09 '11 at 21:28