2

I have table which stores user files e.g images. It has an auto increment primary key and so it's easy to guess via a url what the next/previous id is e.g mydomain/file/12. Whilst i have security in place to prevent unauthorised user from accessing someone else files i'd prefer to have a more complex url id which is difficult to guess.

The table will have a a lot of inserts/deletes so I've stuck to using a auto increment id for the primary key as opposed to using a uuid as a primary key due to it's associated performance issues.

So i was thinking of adding an additional column called uuid which i could use to retrieve files. Whilst mysql docs state that uuid's are designed as a number that is globally unique in space and time would i still need to implement a unique index on this column since there would be no database mechanism to prevent a collision - if it ever occurred?

Any advice appreciated.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
adam78
  • 9,668
  • 24
  • 96
  • 207

4 Answers4

2

If you want to insist that the column be unique, then create a unique index/constraint on it.

This will prevent manual inserts and updates from duplicating an existing column, even if the automatic mechanism generates a guaranteed-unique value on "normal" inserts.

That said, if performance is of paramount concern, then you might decide to -- essentially -- disable "manual" inserts and forego the unique index. That would be a compromise based on performance needs.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • is the automatic mechanism truly guaranteed-unique. i note mysql uses uuid version 1. Whats the difference between uuid version 1 and version 4? – adam78 Jan 26 '20 at 15:10
  • https://stackoverflow.com/questions/1155008/how-unique-is-uuid – Luuk Jan 26 '20 at 15:12
0

I was wondering the exact same thing. I inherited a db with indexed text UUIDs and the first thing that came to mind is that If you have an index on a textual UUID your writes will be slower because UUIDs are random and the database will need to figure out where to insert the INDEX (also there is the issue of index size, etc..). In order to prevent an unlikely collision you are definitely going to need a UNIQUE INDEX, reaching for one of the following two solutions could be useful to you.

In terms of having a more complex URL, I think reaching for something like a ULID might be better suited for you, since they are randomly generated in some sort of sequential order. The other thing you could potentially do is store and index the UUID in binary format.

Kevin Pimentel
  • 2,056
  • 3
  • 22
  • 50
0

If you are happy with your id selection in the context of the database, and you just want to ensure the sequential id is hidden from the URL and form the end user, then I would use something like hashids (https://hashids.org/) to convert your id into a string you can display in your URLs.

If on the other hand, you want the benefits of a globally unique identifier, but haven't gone with one as your primary key because of performance issues that come from randomness, then I would consider a GUID that's designed with data locality in mind. Namely UUIDv7 or it's typed version typeid (https://github.com/jetpack-io/typeid)

Regarding your question on whether you should enforce uniqueness in a uuid column, my take is that yes, you should. Without it, nothing prevents a database client from executing an INSERT statement for two different records with the same UUID.

Daniel
  • 1
  • 1
0

The database will let you add the column without the index (it's not actually required), but since you expect to do lookups for specific/exact values you will find the performance boost for the application far exceeds the cost.

The advantage is the index only needs to store the UUID and incrementing ID, and not the actual image data or any other metadata. This will allow the index maintenance for INSERTs (these values should both be immutable, so no UPDATE maintenance is needed) to be far more efficient than if the UUID were the primary key in the first place, but still give you most of the benefits of indexed lookups.

In other databases, you could further tune index maintenance cost by adjusting the index fill factor. This would reduce the frequency where adding a record forces splitting into a new page. However, MySQL does not support this.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794