11

Is it possible to have identity column in documentDB for autoincrement, it's usually handy for IDs? Any link or hint relating to it can be useful.

Thanks.

Sandip Bantawa
  • 2,822
  • 4
  • 31
  • 47

2 Answers2

9

AFAIK, DocumentDB does not have this kind of concept. Each document in DocumentDB has an id property which uniquely identifies the document but that id field is of type string. When creating a document, you may choose not to specify a value for this field and DocumentDB assigns an id automatically but this value is a GUID. Thus if you wish to achieve auto increment type functionality, you would need to handle this on your own. However please do keep in mind that it is a string type property so even if you're handling this on your own, you would need to pad your string with zeros so that values are returned in proper order i.e. 1, 2, 3 etc. instead of 1, 10, 11, ... 19, 2, 20 ....

Gaurav Mantri
  • 128,066
  • 12
  • 206
  • 241
  • can we explore data on DocumentDb online, like we view blobs via explorer?? – Sandip Bantawa Nov 02 '14 at 17:01
  • Sure you can. 3 Options: 1) Do it in preview portal (https://portal.azure.com). 2) Use My DocumentDB Explorer (http://geekswithblogs.net/shaunxu/archive/2014/09/17/quotmy-documentdbquot---a-simple-web-based-documentdb-management-tool.aspx). 3) Cloud Portam (Disclosure - This is the tool I am building) - http://cloudportam.com/features/documentdb. – Gaurav Mantri Nov 02 '14 at 17:11
  • Even Azure portal has explorer, just found :-) – Sandip Bantawa Nov 03 '14 at 14:34
  • That was the 1st option I listed above :) – Gaurav Mantri Nov 03 '14 at 14:47
  • Ya it but didnt get it, coz it was down below and unhighlighted :-) – Sandip Bantawa Nov 03 '14 at 14:49
  • This: `DocumentDB assigns an id automatically` is deceiving. DocumentDB itself doesn't do this, it's the library you use what auto-generates the `id` field. DocumentDB will complain if the `id` field is missing. – Parziphal Oct 23 '15 at 18:56
  • @Parziphal Actually if a value for the _id property is not supplied by the client during its call to perform an insert, it is created by the system (DocumentDB). It is actually the _rid property internally. https://azure.microsoft.com/en-us/documentation/articles/documentdb-resources/ – dmcquiggin Aug 21 '16 at 14:18
7

There is still (as of August 2016) no built-in Identity functionality; there is a request for such on the Feedback forum for DocumentDB that can be voted for here. But bear in mind that an automatic counter is somewhat against most NoSQL design philosophies.

However, there are a couple of approaches that can be followed as workarounds, and both have a caveat; the first way is to use a Counter Document updated within a Stored Procedure:

STORED PROCEDURE AND COUNTER DOCUMENT

  1. Create a Document that will contain a numeric value property. Either cache its self-link, or preferably create it with a known id such as "__DocumentType_Counter", and then you can use the UriFactory to build a document link, which allows efficient 'direct' access to this Counter Document.

  2. Create a Stored Procedure that accepts a Document to be inserted, and the Uri of the Counter Document. Within this Stored Procedure, select the value of the Counter, increment it, assign it to the relevant property of the Document to be inserted, and if this is successfully persisted, save the incremented value as an update to the Counter Document.

This approach will work, as Stored Procedures are automatically executed as a Transaction.

Caveat: But, Transactions are currently only scoped within the boundary of a Collection - you cannot use this approach with a Counter Document in one Collection when inserting a Document into another.

REDIS

The second option to explore, which adds a little more complexity and a little extra cost, and is not fully transactional in terms of the increment and insert of a Document (but in my personal experience provides greater throughput), is to create an Azure Redis cache, and keep the Counter value in the Redis Key Value store, which is then queried and incremented using a LUA script.

This ensures the query of the Counter value, and its increment, is an atomic transaction as Redis is single threaded, and LUA scripts basically block until completion (but execute very quickly).

It is also possible to use MULTI - EXEC commands.. these have their own issues, and need to be investigated to see if they are relevant to you.

More information on Redis transactions can be found here - Transactions in Redis

Caveat: If persisting your Document fails, you have 'ghost' Ids, which may or may not be acceptable to you.

Update: In response to the comment regarding "addition of RU cost to every create operation", and "serialized all writes" - yes, obviously, when using the Counter Document approach this is the case when creating documents that require an incremented Identity property; no other inserts would be affected.

Both of these 'costs' are necessary to meet the desired functionality when using DocumentDB - that does not inherently support the concept of an Identity property; persistence to a Document to ensure recovery / continuity, serialisation for consistency. RU cost is, in common usage, negligible and techniques such as batching can be used to offset this.

There is currently no DocumentDB alternative that does not persist a counter or serialise writes, while guaranteeing integrity for an incremented counter from multiple callers.

The advice from Andrew Liu of the DocumentDB team is, indeed, to use a Counter Document, as mentioned in a linked question - please also note the comments below Andrew's answer.

As stored procedures are precompiled in DocumentDB they are an efficient way of performing "> 1" operations. They are also currently the optimal way to perform a transaction. 'Somewhere' has to be the single source of truth for the counter; as mentioned previously, one option is Redis, but as I point out, that is not part of a transaction that will be rolled back if document insert fails, which may or may not be acceptable; in Event Sourcing, for example, it is not.

dmcquiggin
  • 3,230
  • 24
  • 37
  • 3
    By storing a counter in a document, you've just added RU cost to every single create operation (cost+time for read+write of `counter` document, as well as cost+time for executing stored procedure code). And you've forcibly serialized all writes, since you'll be locking the `counter` document for each create operation. – David Makogon Jun 05 '16 at 12:17
  • 1
    @DavidMakogon - I would be very interested to hear your alternative suggestion, that does not have such behaviours. – dmcquiggin Aug 09 '16 at 21:42