60

I don't understand why

SELECT UUID();

Returns something like:

3f06af63-a93c-11e4-9797-00505690773f

But if I insert it into a binary(16) field (the UUID() function) with for instance a BEFORE INSERT trigger and run a select, it returns something like:

0782ef48-a439-11

Note that these two UUIDs are not the same data.

I realize binary and an UUID string doesn't look identical, but shouldn't the selected data at least be just as long? Otherwise how can it possibly be equally likely to be unique?

Is it better to store it as char(36)? I just need it to be unique to prevent duplicate inserts. It is never selected or used for joins.

EDIT:

before trigger would be like:

BEGIN

if NEW.UUID IS NULL THEN

NEW.UUID = UUID();

END IF

END
nickdnk
  • 4,010
  • 4
  • 24
  • 43
  • Show how you're doing the `INSERT`. – Barmar Jan 31 '15 at 11:38
  • 2
    `BINARY(16)` can only hold 16 characters. So it will contain the first 16 characters of the UUID that you store in it. – Barmar Jan 31 '15 at 11:40
  • According to http://stackoverflow.com/questions/10950202/how-to-store-uuid-as-number that's not the case. Also http://stackoverflow.com/questions/17726682/read-mysql-binary16-uuid-with-java – nickdnk Jan 31 '15 at 11:41
  • That answer uses `UNHEX()` to convert the UUID to a number that will fit into 16 bytes. – Barmar Jan 31 '15 at 11:42
  • No matter how you spin it, wouldn't that reduce the complexity by around 50% then? This has me very confused. – nickdnk Jan 31 '15 at 11:43
  • I don't understand the question. Complexity of what? – Barmar Jan 31 '15 at 11:43
  • Likelyness that the UUID is unique. – nickdnk Jan 31 '15 at 11:44
  • http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_uuid explains how it ensures it's unique. – Barmar Jan 31 '15 at 11:44
  • Okay we're completely talking past each other. I know that the initial UUID is more or less guaranteed to be unique,but if you cut away 50% of the data, it probably isn't anymore. So why do people hex or unhex an UUID and reduce the length by 50%? That makes no sense to me. – nickdnk Jan 31 '15 at 11:46
  • Why are you cutting away 50% of the data? When you use `UNHEX()` every 2 characters becomes one byte of the result, so it will then fit in `BINARY(16)`. – Barmar Jan 31 '15 at 11:48
  • There's something I just don't understand at all about hex, then. I'm sorry. – nickdnk Jan 31 '15 at 11:48
  • 2
    Hex digits `0` through `F` are the numbers `0` through `15` in decimal. Each hex digit corresponds to 4 bits, so 2 hex digits is 8 bits, which is 1 byte. – Barmar Jan 31 '15 at 11:51
  • 1
    So to properly save the UUID i should either: A. Hex it to compact it to 16 bytes, or B. Store it as char(36) instead? – nickdnk Jan 31 '15 at 11:53
  • Yes, that's correct. – Barmar Jan 31 '15 at 11:54
  • Thanks. I'll have to look into that then. – nickdnk Jan 31 '15 at 11:56

6 Answers6

132

So, as a response to comments. The correct way to store a 36-char UUID as binary(16) is to perform the insert in a manner like:

INSERT INTO sometable (UUID) VALUES
       (UNHEX(REPLACE("3f06af63-a93c-11e4-9797-00505690773f", "-","")))

UNHEX because an UUID is already a hexed value. We trim (REPLACE) the dashes in the statement to bring the length down to 32 characters (our 16 bytes represented as HEX). You can do this at any point before storing it, obviously, so it doesn't have to be handled by the database.

You may retrieve the UUID like this:

SELECT HEX(UUID) FROM sometable;

Just in case someone comes across this thread and is unsure how this works.

And remember: If you're selecting a row using the UUID, use UNHEX() on the condition:

SELECT * FROM sometable WHERE UUID = UNHEX('3f06af63a93c11e4979700505690773f');

or literal notation (as mentioned by Alexis Wilke):

SELECT * FROM sometable WHERE UUID = 0x3f06af63a93c11e4979700505690773f;

And NOT HEX()on the column:

SELECT * FROM sometable WHERE HEX(UUID) = '3f06af63a93c11e4979700505690773f';

The last solution, while it works, requires that MySQL HEXes all UUIDs before it can determine which rows match. It's very inefficient.

Edit: If you're using MySQL 8 you should have a look at the UUID functions as mentioned in SlyDave's answer. This answer is still correct, but it doesn't optimise the UUID indexes which can be done natively using those functions. If you're on < MySQL 8 or MariaDB, you can implement Devon's polyfill, which provides identical functionality on previous versions of MySQL.

nickdnk
  • 4,010
  • 4
  • 24
  • 43
  • Great question and answer. An off-topic question, when the client UI retrieves and update the record later, this UUID is the key. Should I remember the binary or the string at client side for posting back later? – Jeb50 May 30 '20 at 23:41
  • @Jeb50 The client (and your API) should handle the UUID as a string, with or without dashes. – nickdnk Jun 02 '20 at 17:29
60

As of MySQL 8 you can use two new UUID functions:

  • BIN_TO_UUID

    SELECT BIN_TO_UUID(uuid, true) AS uuid FROM foo;
    -- 3f06af63-a93c-11e4-9797-00505690773f
    
  • UUID_TO_BIN

    INSERT INTO foo (uuid) VALUES (UUID_TO_BIN('3f06af63-a93c-11e4-9797-00505690773f', true));
    

This method also supports rearranging the time component of the uuid to enhance indexing performance (by ordering it chronologically), simply set the second argument to true - this only works for UUID1.

If you are using the true on UUID_TO_BIN flag for indexing performance (recommended), you must also set it on BIN_TO_UUID otherwise it won't convert back properly.

See the documentation for further details.

friek108
  • 1,064
  • 1
  • 12
  • 23
SlyDave
  • 986
  • 12
  • 22
  • 2
    After a lot of research, we've decided to migrate from MySQL to PostgreSQL because they have a UUID data type (just like auto increment, it auto inserts and more importantly, you can just read it directly using a string). Otherwise we have to refactor our entire codebase everywhere we have an update or read using an id with both mysql 5.7 or 8. – friek108 Feb 17 '18 at 21:50
  • @friek108 MySQL has a UUID() function for that specific purpose as well. While it's not a datatype, it's equivalent to binary(16) and generates insert-friendly sequential (not random) UUIDs. – nickdnk Feb 28 '18 at 15:30
  • Thanks @nickdnk - but ultimately it would mean we have to update every single one of our queries across our entire codebase to include this function.. – friek108 Feb 28 '18 at 22:49
  • Or you could use a trigger, @friek108 :) – nickdnk Mar 04 '18 at 17:49
  • 4
    @friek108 - How do you make migration from MySQL to PostgreSQL easier than changing the codebase to UUID inserts? I don't really see how that would work. Can you elaborate? – nickdnk Aug 02 '18 at 15:52
22

Polyfill for BIN_TO_UUID and UUID_TO_BIN for MySQL 5 or MariaDB with the swap_flag parameter.

DELIMITER $$

CREATE FUNCTION BIN_TO_UUID(b BINARY(16), f BOOLEAN)
RETURNS CHAR(36)
DETERMINISTIC
BEGIN
   DECLARE hexStr CHAR(32);
   SET hexStr = HEX(b);
   RETURN LOWER(CONCAT(
        IF(f,SUBSTR(hexStr, 9, 8),SUBSTR(hexStr, 1, 8)), '-',
        IF(f,SUBSTR(hexStr, 5, 4),SUBSTR(hexStr, 9, 4)), '-',
        IF(f,SUBSTR(hexStr, 1, 4),SUBSTR(hexStr, 13, 4)), '-',
        SUBSTR(hexStr, 17, 4), '-',
        SUBSTR(hexStr, 21)
    ));
END$$


CREATE FUNCTION UUID_TO_BIN(uuid CHAR(36), f BOOLEAN)
RETURNS BINARY(16)
DETERMINISTIC
BEGIN
  RETURN UNHEX(CONCAT(
  IF(f,SUBSTRING(uuid, 15, 4),SUBSTRING(uuid, 1, 8)),
  SUBSTRING(uuid, 10, 4),
  IF(f,SUBSTRING(uuid, 1, 8),SUBSTRING(uuid, 15, 4)),
  SUBSTRING(uuid, 20, 4),
  SUBSTRING(uuid, 25))
  );
END$$

DELIMITER ;

--
-- Tests to demonstrate that it works correctly. These are the values taken from
-- https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin
--
-- If you run these SELECTs using the above functions, the 
-- output of the two columns should be exactly identical in all four cases.
SET @uuid = '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT HEX(UUID_TO_BIN(@uuid, 0)), '6CCD780CBABA102695645B8C656024DB';
SELECT HEX(UUID_TO_BIN(@uuid, 1)), '1026BABA6CCD780C95645B8C656024DB';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,0),0), '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,1),1), '6ccd780c-baba-1026-9564-5b8c656024db';

Included are the SELECT samples from https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin that demonstrate that the above code returns the exact same results as the 8.0 function. These functions are considered DETERMINISTIC as they always produce the same output for a given input. See https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html

nickdnk
  • 4,010
  • 4
  • 24
  • 43
Devon Bessemer
  • 34,461
  • 9
  • 69
  • 95
  • Exactly what I was looking for. Thanks! – James Marks Nov 10 '19 at 19:13
  • 1
    You saved my day! Since my host just support MySQL 5 i was going to be forced to rewrite all my BIN_TO_UUID calls! – Michael Wallace Feb 04 '20 at 23:06
  • You should really add the SWAP_FLAG to both functions as that's what the native functions support: https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin and https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_bin-to-uuid – nickdnk Feb 19 '20 at 15:46
  • 1
    @Devon I figured it out and edited your answer. Thanks for pointing me in the right direction. – nickdnk Mar 06 '20 at 00:23
  • i tried to use this on my server, but when i use UUID_TO_BIN(uuid(),1) it gives me strange characters like ��6Z5�*,�U��).. why did i get this? – Codeboy Newbie Mar 09 '23 at 07:29
  • 1
    @MuhammadFauzi It produces a binary output. If you want to display the UUID, you'd have to do HEX(UUID_TO_BIN(uuid(),1)), or SELECT HEX(col) if you already put it into a binary column. – nickdnk Jun 12 '23 at 15:17
16

I am using MariaDB so BIN_TO_UUID functions family do not exist. I managed to get the corresponding values anyway.

bin -> hex

Here, uuid is the binary(16) value of an uuid; you'll use the value below to SELECT a readable version of it.

LOWER(CONCAT(
    SUBSTR(HEX(uuid), 1, 8), '-',
    SUBSTR(HEX(uuid), 9, 4), '-',
    SUBSTR(HEX(uuid), 13, 4), '-',
    SUBSTR(HEX(uuid), 17, 4), '-',
    SUBSTR(HEX(uuid), 21)
))

hex -> bin

Here, cc6e6d97-5501-11e7-b2cb-ceedca613421 is a readable version of an UUID, and you'll use the value below in a WHERE clause to seek for it.

UNHEX(REPLACE('cc6e6d97-5501-11e7-b2cb-ceedca613421', '-', ''))

Cheers

Alain Tiemblo
  • 36,099
  • 17
  • 121
  • 153
  • Why don't you just replace the dashes like in my answer? What's the advantage of doing 5 substring calls? – nickdnk Mar 22 '18 at 16:18
  • 3
    You are totally right. I don't know either, head to the handlebars i suppose. Thanks :) – Alain Tiemblo Mar 22 '18 at 20:35
  • Hehe. Okay. I'd also be doing the string-concat stuff in the application layer instead... there's really no reason to make the database do five HEX() and SUBSTR() per UUID when PHP, C#, java or whatever could easily insert the dashes by splitting the UUID. I actually don't work with the dashes anywhere in my application, since they're more or less redundant. – nickdnk Mar 23 '18 at 02:34
16

The other answers are correct. The UUID() function returns a 36 character string and that needs to be converted using the shown functions (UNHEX() or, on newer platforms, UUID_TO_BIN()).

However, if you use your own software to create your UUIDs, then you can use the Hexadecimal Literal notation instead.

So I would use the following with the MySQL UUID() function:

INSERT INTO sometable (id) VALUES (UNHEX(REPLACE(UUID(), '-', '')));  -- all versions
INSERT INTO sometable (id) VALUES (UUID_TO_BIN(UUID());               -- since v8.0

But use this in case I generate my own UUIDs;

INSERT INTO sometable (id) VALUES 0x3f06af63a93c11e4979700505690773f;

Similarly, you can use Hexadecimal Literals in your WHERE clauses:

SELECT * FROM sometable WHERE id = 0x3f06af63a93c11e4979700505690773f;

This will be faster if you do not have to convert your data to a UUID string each time.

Note: the 'x' in '0xaBc is case sensitive. The hexadecimal digits are not, however.

Alexis Wilke
  • 19,179
  • 10
  • 84
  • 156
  • Came here looking for a solution with literals... thanks! – Robert Penridge Sep 09 '19 at 22:04
  • if you are using a DB with charset not UTF-8 and a MariaDB Java connector with version >= 2.2.4, your insert should be like `INSERT INTO sometable (id) VALUES (UNHEX(REPLACE(CONVERT(UUID() using utf8mb4), '-', '')));` to guarantee uniqueness. – Davi Cavalcanti Oct 26 '20 at 23:16
  • 1
    @DaviCavalcanti I wonder why you'd need the "utf8" conversion. A UUID is only composed of ASCII characters anyway. – Alexis Wilke Oct 26 '20 at 23:21
-2

In MySQL 4.0 and above you can change the size of UUID like using the MID

SELECT MID(UUID(),1,32); # 32 characters long UUID
SELECT MID(UUID(),1,11); # 11 characters long UUID

As @nickdnk pointed out you shouldn't do this. The total length of UUID makes them unique. Striping a part of them can lead to non unique values.

Endre Soo
  • 11
  • 2
  • You should not change the size of a UUID. It's designed to be unique and the different parts/sections are generated by different sources. Taking a subset of the UUID like this is asking for trouble. – nickdnk Jul 17 '21 at 21:01
  • @nickdnk, Yeah, you're right. I'm updating my answer. – Endre Soo Jul 19 '21 at 10:43