3

First, I have read a few posts about this, like this one: Postgresql: UUID or SEQUENCE for primary key?

My question is quite simple: my IDs in my table are UUID v4 (created in Rails or from an iOS app). As UUID by default is unique, can I remove the primary key on ID and just add an index on it? The main (and uniq?) goal is to save time (a few ms) on inserting (PostgreSQL won't have to verify if the ID is already used) at each insert.

Is-it a good choice ? Or do I keep the PK to add another verification of the uniqueness before inserting ?

For info, the table will manage maybe 10 millions records.

alex.bour
  • 2,842
  • 9
  • 40
  • 66
  • 1
    Theoretically, a table without a PK constraint has no meaning. (BTW: I have a drawer with unique socks ...) – joop Nov 01 '18 at 11:07

2 Answers2

2

First: UUIDs are not really unique. But the chance to generate double values is really really low (How unique is UUID?).

But there are some other issues with UUID. UUIDs are made for exchange data between different points. So if you think of two databases which communicate both would share the same data sets with the same UUID. Now think about an archive were data sets from many sources are stored. You could have data sets with the same UUID from some old communications.

So it depends on your current (and maybe future possible?) use cases if this could create any problems.

Furthermore I am not sure if it creates any advantages against a simple integer value concerning the space of your primary key index. Notice that every primary key automatically creates an internal index per default (so there's no need to create a separate index nonetheless). So a primary key index for an integer column might be smaller and faster.

S-Man
  • 22,521
  • 7
  • 40
  • 63
  • I need this ID as UUID field as it's used for a sync engine (pull, push) from many devices. I can't use a simple integer instead. And I can't use an ID set as integer as PK, and another field UUID just for the sync engine because it won't be possible to use relations with other tables on ID fields on a "global" system. – alex.bour Nov 01 '18 at 11:15
  • I didn't mean that you should replace the UUID but don't use it as PK column. I meant you should add an simple integer column as PK (smaller index, safe concering doubled UUID rows in certain communication logs or archives, etc) – S-Man Nov 01 '18 at 20:39
  • UUIDs do work well for syncing data between sources. But I wouldn't say they are "made" for it. It can solve for multiple problems. Once such issue is an unsigned integer index that rolls over at about 4M records. – Michael Aug 06 '20 at 01:30
2

Both of the keys that you are describing are apparently being used as surrogate keys. Surrogate meaning that they are not derived from the incoming data, and therefore have no relationship to it other than providing uniqueness.

You do not need 2 keys for the purpose of providing uniqueness, so the answer to your question is that you can drop one or the other of the keys. The size of the table is not really a factor here, as uuid_v4() will provide uniqueness for vastly larger datasets than 10M rows.

Having 2 keys for uniqueness is not just unnecessary, it is also a bottleneck. Both values must be created at insertion time, and both must be validated for uniqueness. Deleting one of them is a clearly better practice.

Kirk Roybal
  • 17,273
  • 1
  • 29
  • 38
  • I don't have 2 keys, but just one: my ID set with an UUID type. My question was just to set a PK on this field OR an index (not uniq as UUID is uniq by default), to save time when inserting. – alex.bour Nov 01 '18 at 11:13
  • a primary key implies that an index would be created anyway for enforcement – Kirk Roybal Nov 02 '18 at 09:15
  • @KirkRoybal I am using mySql and I have both primary key of id(Long) and uuid fields. The uuid field is the identify of the row actually being communicated between tables and applications. I am thinking get rid of the id field and using uuid field as a primary key. Does this make sense? – user3123690 Jan 31 '19 at 22:03