String as primary KEY?

Question

I have this database structure

CREATE TABLE `productinfo` (
  `ProductID` int(11) NOT NULL AUTO_INCREMENT,
  `ProductName` varchar(255) NOT NULL,
  `ProductImage` varchar(255) NOT NULL,
  `CategoryID` int(11) NOT NULL,
  `SubCategoryID` int(11) NOT NULL,
  `ProductBrief` varchar(255) NOT NULL,
  `Features` text NOT NULL,
  `Specifications` text NOT NULL,
  `Reviews` text NOT NULL,
  `Price` varchar(255) NOT NULL,
  `Status` tinyint(4) NOT NULL,
  PRIMARY KEY (`ProductID`)
) ENGINE=MyISAM AUTO_INCREMENT=12 DEFAULT CHARSET=latin1;

I now I need to turn ProductID, CategoryID, and SubCategoryID into a string like Ps-5678 for the part number. ProductID is the primary key so how do i change the structure of the database. CategoryID, and SubCategoryID are primary keys in other tables so how do i handle this..is it as easy as turning

  `ProductID` int(11) NOT NULL AUTO_INCREMENT

into a string..and getting rid of

  PRIMARY KEY (`ProductID`)

ideas, suggestions anyone

Especially if you need to deal with ORMs, it may be easier just to leave the exiting PK and use a unique covering index on the "pretty string product ID" (which actually sounds kind of just like a denormalization of Category, SubCat, etc...) — , Nov 03 '10 at 23:15

score 8 · Accepted Answer · answered Nov 03 '10 at 23:15

8

Primary keys are for the database.

Display names are for end users.

Do not confuse one with another! Don't make a primary key out of something that has a meaning. You will regret it sooner or later.

Having a surrogate key / identity / autonumber as a primary key is a very good idea and is used widely in database design.

You can add a column or even a DERIVED COLUMN and add a unique constraint on it.

answered Nov 03 '10 at 23:15

Raj More

47,048
33
131
198

can you give me an example of surrogate key / identity / autonumber as a primary key – Matt Elhotiby Nov 03 '10 at 23:17
@Matt: You don't need an example - that's exactly what you did first! Just forget about your intention of turning ProductId into a varchar column... In fact, since this "new field" is just a calculated expression, you don't even need to create a column with this value. Just concatenate the underlying fields before showing. – rsenna Nov 03 '10 at 23:20
1

-1 Creating a derived column on a surrogate key undermines the value of having a surrogate at all. You should enforce the uniqueness constraint on the meaningful business key independently of any surrogate key you use. Whether or not you *also* need a surrogate is a separate decision. – nvogel Nov 04 '10 at 07:43
@dportas, no it doesn't. Joining on an integer PK saves database speace and more importantly is faster than joining on a string value. So maintaining the surrogate while creating a display value is valid choice and does not undermine the surrogate at all. – HLGEM Nov 04 '10 at 13:24
1

@HLGEM: one benefit of using a surrogate is that it makes it simpler to update the natural key column if desire but that benefit is lost if the natural key value depends on the surrogate. That was what I was referring to. Saving space is probably an advantage in this case but ought to be weighed against the disadvantage of making the surrogate meaningful by exposing it to users. – nvogel Nov 04 '10 at 16:42
1

-1 Using surrogates as primary keys may be widely used, like macnuggets, but it is wrong. In order to get the power of relational dbs, you need strong identifiers, WITH meaning. I am not saying "never use surrogates", I am saying use surrogates only when you have to, not blindly on every table. – PerformanceDBA Nov 06 '10 at 06:44
1

@HLGEM: that's incorrect. Of course it would be stupid to use VAR anything as a key. But strings (say size as the numeric being compared) are just as fast as numerics. Ints vs CHAR(4). Having the PK dependent and derived from a surrogate is totally absurd. – PerformanceDBA Nov 06 '10 at 06:49
@PerformanceDBA I agree that there is some place in a DB for a key with a meaning - a Calendar table has an INT key where I would use the date value as the key in YYYYMMDD format - but this type of usage is pretty rare for me. – Raj More Nov 08 '10 at 14:21
1

@Raj: The first two stmts in your post are simply incorrect. In a good Rdb, with meaningful keys, plus a few *additional* surrogates (to avoid carrying wide keys into the children), the key, primary or not, has meaning, and is for display. (It does not matter what is rare for you, it matters that you give valid advice in your post;<). – PerformanceDBA Nov 10 '10 at 11:38
@PerformanceDBA Note that I made the statement about PKs not just any key. I have yet to encounter a significant situation where a meaningful PK has never-ever-ever been altered down the line. Even in my own example of a calendar table, we went from needing just the date to Date+Time at a 15 min interval (20101110 vs 20101110130000, 20101110131500...). – Raj More Nov 10 '10 at 18:16
@Raj: Understood. No confusion first time, no confusion now. The point is, the statements are **technically** incorrect. I have no comment on what you have encountered. I have worked on hundreds of databases, about 25 real Relational Dbs with no such madness; about 80 correcting such madness. Understood that requirements change. – PerformanceDBA Nov 10 '10 at 21:13
@PerformanceDBA I'm sure that **technically** you can even use a phone number as a primary key for a Employees table if you really really want to prove a point. I wouldn't. – Raj More Nov 11 '10 at 15:18
@Raj: Keep your straw men to yourself and stick to the point, the question, yoru incorrect answer. – PerformanceDBA Nov 12 '10 at 23:12

score 1 · Answer 2 · answered Nov 03 '10 at 23:05

1

I believe it should be that easy. however you need to determine what string type you want to use http://dev.mysql.com/doc/refman/5.1/en/string-types.html

answered Nov 03 '10 at 23:05

score 1 · Answer 3 · answered Nov 12 '10 at 23:11

Your requirement is unclear. How do you get "PS-5678" fro 3 int columns ? There is only 2 components in your example.

Do you just need to CONVERT the 3 INTs to a single CHAR() string ?

If so, the database is fine, no need to change the table at all !?!?! The three components are already available, correctly seperated, as distinct columns. What you are looking for is merely DISPLAYING the three components as a single string.

score 0 · Answer 4 · answered Nov 04 '10 at 13:21

It's not clear to me from your question just what you do to product, category, and subcategory to make your part number. I'll assume for the sake of argument that you are concatenating them together, like product 123, category 456, subcategory 789 gives part number 123-456-789 or some such.

I like to use natural identifiers as primary keys whenever practical. But "whenever practical" can be a serious constraint. If your natural identifier is derived by somehow combining three other fields, you have four choices:

Make the primary key be the combination of these three fields. This tends to be a pain. All your joins then must match on three fields, searches must test three fields, etc.
Create a new field that is the concatenation of the three fields, and use this as the priamry key. Then whenever one of the "basic" fields changes, also change this concatenated field. This is a very, very bad idea. Don't do it. It's redundant data, with all the bad things that come from redundant data.
Replace the three separate fields with one combined field. This is worse than #2. Now when you need the individual values, you have to take the field apart.
Give up and create a synthetic key, like a serial number. Use this as the primary key, and then just use the natural key for display purposes. If my natural key requires concatenating or otherwise manipulating three fields, I tend to go with this option.

String as primary KEY?

4 Answers4

Linked