
TL;DR: Are the IDs that are auto-generated by DocumentDB supposed to be GUIDs or UUIDs, and is there actually a difference? If they are UUIDs, then which variant/version of UUID?

Background: Some of the DocumentDB client libraries will auto-generate an ID for you if you do not provide one. I have seen it mentioned in the Azure blog and in several related questions that the generated IDs are GUIDs. I know there is some discussion over whether GUIDs are UUIDs, with many people saying that they are.

The problem: However, I have noticed that some of the IDs that DocumentDB auto-generates do not follow the UUID RFC (RFC 4122), which defines only the values 1-5 for the "version" nibble (V in xxxxxxxx-xxxx-Vxxx-xxxx-xxxxxxxxxxxx). DocumentDB generates IDs with any hex digit in that nibble, for example d981befd-d19b-ee48-35bd-c1b507d3ec4f, whose version nibble is the first e of ee48.
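To make the check concrete, here is a minimal Python sketch that inspects the version nibble of an ID string. The helper names `version_nibble` and `looks_like_rfc4122` are hypothetical, made up for illustration. The second helper also checks the variant bits via the standard `uuid` module, which goes slightly beyond what is described above.

```python
import uuid

def version_nibble(id_str: str) -> str:
    """Return the hex digit at the V position of xxxxxxxx-xxxx-Vxxx-xxxx-xxxxxxxxxxxx."""
    return id_str.split("-")[2][0]

def looks_like_rfc4122(id_str: str) -> bool:
    """True if the ID has the RFC 4122 variant and a version in 1-5."""
    parsed = uuid.UUID(id_str)
    # uuid.UUID.version is None when the variant bits are not RFC 4122.
    return parsed.variant == uuid.RFC_4122 and parsed.version in {1, 2, 3, 4, 5}

sample = "d981befd-d19b-ee48-35bd-c1b507d3ec4f"  # the ID quoted in the question
print(version_nibble(sample))
print(looks_like_rfc4122(sample))
```

For the sample ID, the nibble is `e`, so the ID does not pass the check.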

It is possible that this depends on which client is used to create the documents. In our DocumentDB database, we have documents whose third grouping is dde5, 627a, fe95, and so on. These documents were stored from within a stored procedure by calling Collection.createDocument() with the options {'disableAutomaticIdGeneration': false}. Other documents that I create through the third-party DocumentDB Studio application always have 4xxx in the third grouping, which is a valid UUID version. However, documents that I create through the Azure portal have non-standard third groupings like b359.

Question: Are the auto-generated DocumentDB IDs supposed to be GUIDs or UUIDs, and is there actually a difference? If UUIDs, then which variant/version?

shoover

1 Answer


Poking around in the source code on GitHub, I found that the various client and server side libraries use several different methods for creating what they're calling a GUID (in some libraries) or a UUID (in other libraries).

The Node.js client, JavaScript client, and server-side library manufacture what they call a GUID by concatenating a series of random hex digits and hyphens. Note that although these IDs are random, they do not comply with the rules for creating RFC 4122 version 4 UUIDs, because the version and variant bits are not set.
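The effect of that approach can be illustrated with a short Python sketch. This is not the libraries' actual code (they are written in JavaScript); `naive_guid` is a hypothetical stand-in that simply joins random hex digits in the 8-4-4-4-12 layout without setting any version or variant bits.

```python
import random

HEX_DIGITS = "0123456789abcdef"

def naive_guid() -> str:
    """Build a GUID-shaped string from purely random hex digits (illustrative only)."""
    group_lengths = (8, 4, 4, 4, 12)
    return "-".join(
        "".join(random.choice(HEX_DIGITS) for _ in range(length))
        for length in group_lengths
    )

# Collect the "version" nibble (first digit of the third group) over many IDs.
nibbles = {naive_guid().split("-")[2][0] for _ in range(1000)}
print(sorted(nibbles))  # typically shows many different hex digits, not just '4'
```

Because nothing pins the first digit of the third group to 4, IDs produced this way show third groupings such as dde5 or fe95, matching what the question observed.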

The Python client and Java client call their respective standard library methods to generate a random (version 4) UUID.
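For comparison, a minimal sketch of the standard-library route in Python, assuming the call in question is `uuid.uuid4()` (the Java counterpart would be `java.util.UUID.randomUUID()`). It shows that every ID produced this way carries version 4 and the RFC 4122 variant.

```python
import uuid

# Generate a handful of random UUIDs with the standard library.
ids = [uuid.uuid4() for _ in range(5)]

for generated in ids:
    # The third group always starts with '4' for version 4 UUIDs.
    third_group = str(generated).split("-")[2]
    print(generated, generated.version, third_group)

all_version_4 = all(g.version == 4 and g.variant == uuid.RFC_4122 for g in ids)
print(all_version_4)
```

This matches the question's observation that some clients always produce 4xxx in the third grouping.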

The .NET client is available via NuGet, but the source code is not yet published.

Summary:

  • Microsoft is not making a distinction between GUID and UUID in their client libraries. They are using the terms interchangeably.
  • What you get for a GUID/UUID depends on which client library you're using to call DocumentDB when you create your documents.
shoover