3

Why are UUID's used as primary keys? They require a lot of memory to store. From my understanding, they are used as primary keys because every generated value is unique, which is helpful when merging 2 databases since there would be no collisions.

If UUID's are used for privacy of API endpoint guessing, why not instead have, ID, database no. and then hash them both using that as the primary key? Then the ID can be used for relationships, which is convenient. The hashed value of both ID and database no. can be used for primary_key, the searching of data entries using ID is also faster than UUID.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
  • *"Why not instead have, ID, database no. and then hash them both using that as the primary key?"* because there can be duplicates this way – Cid Dec 20 '19 at 08:44
  • 5
    UUIDs require 16 bytes of storage if you store them binary, which isn't 'a lot'. Most hashes have similar if not larger storage requirements. – Mark Rotteveel Dec 20 '19 at 08:45
  • What about the searching penalty of UUID's? –  Dec 20 '19 at 08:46
  • 2
    What penality ? the search complexity of a B-Tree index (the most common case, AFAIK) is O(log N), whatever the type – Cid Dec 20 '19 at 08:47
  • Are you referring to the end value of hashing database no.*ID which is duplicated or the database no. value is duplicated? –  Dec 20 '19 at 08:48
  • Whether you store your UUIDs as text or binary, you should be able to index such a column, which can have generally good performance. – Tim Biegeleisen Dec 20 '19 at 08:48
  • What about the insert performance? –  Dec 20 '19 at 08:50
  • @Liability2020 Yes, hashes can be the same for many differents elements to hash – Cid Dec 20 '19 at 08:50
  • 1
    Does this answer your question? [Advantages and disadvantages of GUID / UUID database keys](https://stackoverflow.com/questions/45399/advantages-and-disadvantages-of-guid-uuid-database-keys) – Basil Bourque Dec 20 '19 at 09:38

1 Answers1

9

Universally Unique IDs (UUID) are used primarily for:

  • Generating identifiers in various systems without requiring coordination with a central authority (such as a sequence number generator in a database server).
  • Sharing data records between systems, such as federated databases, without risk of collision.

Your point about privacy of API endpoint guessing was not a design goal originally as far as I know. But it may be another feature given contemporary URL-based access methods.

A UUID is actually a 128-bit value. Do not conflate this with the 36-character hex string usually used for display to humans.

For databases supporting UUID as a data type, such as Postgres, a key using a data type of UUID is using twice the memory and storage space of a 64-bit integer sequence number, or four times a 32-but integer (4 billion range), that are the usual other choice for a surrogate key. Whether that is “a lot” is a judgement call for each database designer to make. Given how inexpensive and readily available both memory and storage space is in modern computing hardware, I generally consider the benefits of UUIDS to be well worth the extra memory/storage.

If you need a universally unique id, use a Universally Unique ID (UUID). No need to invent your own.

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154