0

I would like to create a column (not a PK) whose value represents as a unique identifier. It is not used for encryption or security purposes - strictly to identify a record. Each time a new record is inserted, I want to generate and store this unique identifier. Not sure if this is relevant, but I have 1 million records now, and anticipate ~3 million in 2 years. I'm using a web app in PHP.

I initially just assumed I'd call UUID() and store it directly as some sort of char data type, but I really wanted to do some research and learn of a more efficient/optimized approach. I found a lot of great articles here on SO but I'm having a hard time with all of the posts because many of them are somewhat older, or disagree on the approach that has ultimately left me very confused. I wanted to ask if someone more wiser/experienced could lend me a hand.

I saw folks linked here on various posts and suggested to implement things this way: https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/

but i'm having a hard time fully knowing what to do after reading that article. Ordered UUID? What should I store it as? I think maybe that particular page is a tad over my head. I wanted to ask if someone could help clarify some of this for me. Specifically:

  1. What data type should my column be for storing binary data (that represents my UUID)?
  2. What function should I use to convert my UUID to and from some binary value?
  3. Any more advance or tips someone could share?

Thanks so much!

NullHypothesis
  • 4,286
  • 6
  • 37
  • 79
  • Questions 1 and 2 are answered in the linked blog post's conclusion section. No idea what you expect us to answer for 3. – Shadow Aug 27 '16 at 20:02

3 Answers3

2

If you call MySQL's UUID(), you get a variant that is roughly chronological. So, if you need tend to reference "recent" records and ignore "old" records, then rearranging the bits in the UUID can provide better "locality of reference" (that is, better performance).

Version 4 does not provide such.

You can turn the UUID from the bulky 36-character string into a more compact, 16-byte, (Q1) BINARY(16) by code (Q2) in my UUID blog. That document discusses various other aspects of your question. (Q3)

The Percona link you provided gives some benchmarks 'proving' the benefit.

3M uuids taking 16 bytes each = 48MB. It is bulky, but not likely to cause serious problems. Still, I recommend avoiding uuids whenever practical.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thanks for the response; Just to be clear, I want to use UUID() and use UUID v1 not v4. I want to store this as a BINARY(16), using the code you provided for MariaDB (I assume that will work exactly the same as MySQL?). When you say 48MB you are saying the entire overhead just adds 48MB on disk total? – NullHypothesis Aug 28 '16 at 14:11
  • This comment here is perfect " If your SELECTs tend to be for "recent" uuids, then they, too, will be easily cached. If, on the other hand, your SELECTs often reach for old uuids, they will be random and not well cached. Still, improving the INSERTs will help the system overall." This is exactly what I expect to happen with my system! Thanks! – NullHypothesis Aug 28 '16 at 14:24
  • 3M*16 = 48MB; the actual value / actual change will probably be more because of overhead, indexes, etc. Even if it is 200MB, is that significant? – Rick James Aug 28 '16 at 16:29
  • (My mention of "v4" was aimed at Boris and anyone else getting UUIDs from places other than MySQL.) – Rick James Aug 28 '16 at 16:31
  • Hey @rick-james I'm sorry, one more question - do I want to place an index on this column? Thanks. – NullHypothesis Aug 28 '16 at 20:34
  • If you do `WHERE xx = ...`, you need `INDEX(xx)`. See my [_Index Cookbook_](https://mariadb.com/kb/en/mariadb/building-the-best-index-for-a-given-select/). – Rick James Aug 30 '16 at 18:14
0

I used UUID v4 on a recent project. The code to generate UUID v4 can be sourced here: PHP function to generate v4 UUID

The main difference is that we compressed it to 22 bytes case-sensitive format. This approach is also used by ElasticSearch.

The resulting values are stored simply as char(22).

Community
  • 1
  • 1
Boris Schegolev
  • 3,601
  • 5
  • 21
  • 34
0

2023 Update

Some MySql servers (though notably not MariaDB at the time of writing) now have built-in UUID_TO_BIN() and BIN_TO_UUID() functions. Though MariaDB currently has a UUID type which presumably performs these conversions automatically .

Matt
  • 699
  • 5
  • 15