3

Possible Duplicates:
Surrogate Vs. Natural/Business Keys
When not to use surrogate primary keys?

So, I would prefer having an INT as a primary key with AI. But, here is the reason why I am considering making a unique string as a primary key: I don't have to query related tables to fetch the primary key since I will already know it.

For example:
I have a many to many relation:

Customer - Order - Product

Let's say I want to add a new customer and a new order, and I already know what they bought. I have to do a query on product table to get the INT, but if I have the string (unique) that is a primary key I don't have to do the query (this seems cleaner to me, I am not talking about optimization/run-time-speeds, not at all).

Community
  • 1
  • 1
Zombies
  • 25,039
  • 43
  • 140
  • 225
  • 1
    This has been asked many times before. Just search "surrogate". My best bet: http://stackoverflow.com/questions/63090/surrogate-vs-natural-business-keys – Per Wiklander Feb 13 '10 at 19:59
  • This could be closed as a duplicated, but it is a little more specific since I am asking about which is better for a many to many relation! – Zombies Feb 13 '10 at 20:19

3 Answers3

5

If you are not worried about optimization, the main 2 criteria for primary key are:

  • Uniqueness

  • Constanteness (never changes)

E.g. if your problem domain is - and will always be - such that 2 product names are always distinct, AND that no product will ever change its name (think Norton Antivirus -> Symantec Antivirus for a simple example of name change), then you may use the product name as the unique key.

The two MUST be 100% true not only today, but for any foreseeable future lifetime of the database.

Therefore using numeric ID is highly recommended as you may not always be able to forecee such things - and changing the DB structure later on to have a product ID is of course orders of magnitude worse than a minor inconvenience of needing to map and ID from the name in your queries.

DVK
  • 126,886
  • 32
  • 213
  • 327
  • Check and check. I am wondering though if anyone has done this and wish they hadn't. – Zombies Feb 13 '10 at 19:56
  • 2
    I have, and I have regretted it once my functional spec changed so that the key needed to mutate. Epic FAIL :) – DVK Feb 13 '10 at 20:03
3

If you can guarantee that your VARCHAR field is indeed unique and hopefully stable (not changing), then you can definitely use it as a primary key without any conceptual problems.

The only real reasons against using it as your primary key (or even more importantly: your clustering key in SQL Server) are indeed performance-based. A wider and varying size clustering key is suboptimal in many ways, and affects not just your table and its clustering index, but also all non-clustered indices on that table. But if that's none of your concern, again - you'll be fine with a VARCHAR as your primary key.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • Well, let me ask this... going back to the many to many example. Would it be considered bad since I am duplicating data/actual content? I suppose because it doesn't change then it isn't wrong... It just feels wrong. – Zombies Feb 13 '10 at 20:16
  • Yes, obviously, if you use natural keys (e.g. "real" data) for your keys, you'll have a certain duplication when you use those keys in many-to-many tables. The same applies to surrogate INT keys - it just might be less obvious and less "bothering". – marc_s Feb 13 '10 at 20:29
0

Er... both are correct in a way

  • Your logical model and design will use the unique string. This is the natural key.
  • The actual implementation may use a numeric auto number column (surrogate key) because of architecture/performance
gbn
  • 422,506
  • 82
  • 585
  • 676
  • 1
    Plausible, but unlikely. The physical model is derived from the Logical one - IME, the only columns not displayed in the Logical model are foreign keys. Switching from natural to artificial key is not a trivial model change. – OMG Ponies Feb 13 '10 at 20:27
  • @OMG Ponies: If you use Object Role Modeling, then the design will never use a surrogate key. When you come to implement it, you will add a surrogate key. I think either you misunderstand me or I've used the wrong terms. – gbn Feb 14 '10 at 09:16
  • IME (Oracle Designer, SQL Server SSMS doesn't do Logical models), you do define the the pk, be it natural or artificial, in the logical model. What are you using that does Logical models for SQL Server? – OMG Ponies Feb 14 '10 at 18:37
  • @OMG Ponies: I'd use something like NORMA: http://www.orm.net/ http://sourceforge.net/projects/orm/ – gbn Feb 14 '10 at 18:50
  • Took a look, thx - it's similar to the Logical ERDs in Oracle, but the key difference is that attributes aren't defined in ORM. Oracle Designer logical models include both attributes and their data types. – OMG Ponies Feb 14 '10 at 19:25