7

DynamoDB does not have the option to automatically generate a unique key for you.

In examples I see people creating a uid out of a combination of fields, but is there a way to create a unique ID for data which does not have any combination of values that can act as a unique identifier? My question is specifically aimed at Lambda functions.

One option I see is to create a uuid based on the timestamp with a counter at the end, insert it (or check whether it exists), and in case of duplication retry with an incremented counter until it succeeds. But this would mean that I could run over the execution time limit of the Lambda function without ever creating an entry.

Quentin Hayot
Philiiiiiipp
  • What programming language are you using? Typically you generate a UUID in your code using an applicable library. No need to code it yourself, or retries/execution time issues with this approach. – dmulter Aug 27 '18 at 15:22
  • I am using nodejs, but especially in an async environment it seems impossible to me to create a uuid that does not have potential retries/execution time issues. – Philiiiiiipp Aug 28 '18 at 07:16

6 Answers

21

If you are using Node.js 8.x, you can use the uuid module.

var AWS = require('aws-sdk'),
    uuid = require('uuid'),
    documentClient = new AWS.DynamoDB.DocumentClient();
[...]
        Item:{
            "id":uuid.v1(),
            "Name":"MyName"
        },

If you are using Node.js 10.x, you can use awsRequestId without the uuid module.

    var AWS = require('aws-sdk'),
        documentClient = new AWS.DynamoDB.DocumentClient();
[...]
    Item:{
        "id":context.awsRequestId,
        "Name":"MyName"
    },
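
For readers wondering where `context` comes from: it is the second argument Lambda passes to the handler. Below is a minimal sketch of the whole flow, not the exact code behind the fragment above. The table name `MyTable` and the helpers `buildPutParams` and `makeHandler` are made up for illustration, and the DocumentClient is passed in so the sketch runs without the AWS SDK installed.

```javascript
// Builds the parameters for DocumentClient.put().
// 'MyTable' is a hypothetical table name, not taken from the answer.
function buildPutParams(context) {
  return {
    TableName: 'MyTable',
    Item: {
      id: context.awsRequestId, // request ID supplied by the Lambda runtime
      Name: 'MyName',
    },
  };
}

// `documentClient` is injected; in a real function it would be
// `new AWS.DynamoDB.DocumentClient()` from aws-sdk v2 (assumption).
function makeHandler(documentClient) {
  return async function handler(event, context) {
    const params = buildPutParams(context);
    await documentClient.put(params).promise();
    return params.Item.id;
  };
}
```

In a deployed function you would export it with something like `exports.handler = makeHandler(new AWS.DynamoDB.DocumentClient())`. Keep in mind that this yields only one ID per invocation.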
hatted
  • Reusing the awsRequestId is an interesting, dependency-free solution if you just need one UUID per request and you're not using the request ID in the database for anything else. Just be careful not to store it elsewhere in a way that could cause a conflict, and not to use it twice in the same request. – Chris Blackwell Jun 22 '20 at 23:13
  • What is `context` supposed to be? Downvoting due to lack of clarification. – Epic Speedy Oct 23 '21 at 20:53
  • `context` is the second argument of the Lambda handler: `exports.handler = async function(event, context) {}` – Anton Poznyakovskiy Dec 28 '21 at 13:05
9

The UUID package available on NPM does exactly that.

https://www.npmjs.com/package/uuid

You can choose between 4 different generation algorithms:

  • V1 Timestamp
  • V3 Namespace (MD5 hash)
  • V4 Random
  • V5 Namespace (again, but with a SHA-1 hash)

This will give you:

"A UUID [that] is 128 bits long, and can guarantee uniqueness across space and time." - RFC4122

The generated UUID will look like this: 1b671a64-40d5-491e-99b0-da01ff1f3341
If it's too long, you can always encode its 16 raw bytes in Base64 to get G2caZEDVSR6ZsNoB/x8zQQ, but the timing and namespace information contained in the raw UUID will then no longer be directly readable for manipulating your data (which might not matter to you).
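
As a rough sketch of that encoding step, using only Node's built-in `Buffer` (the helper names `uuidToBase64` and `base64ToUuid` are made up for illustration):

```javascript
// Convert a textual UUID into a shorter Base64 string.
function uuidToBase64(uuid) {
  const hex = uuid.replace(/-/g, '');   // 32 hex digits
  return Buffer.from(hex, 'hex')        // 16 raw bytes
    .toString('base64')                 // 24 characters including padding
    .replace(/=+$/, '');                // drop the trailing '==' -> 22 characters
}

// Reverse the encoding and restore the dashed 8-4-4-4-12 layout.
function base64ToUuid(b64) {
  const hex = Buffer.from(b64, 'base64').toString('hex');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');
}
```

Since the encoding is reversible, `base64ToUuid(uuidToBase64(id))` gives back the original string whenever you need the raw form. Note that standard Base64 can contain `/` and `+`, which may need escaping if the ID ends up in a URL.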

Quentin Hayot
  • Can it generate alpha numeric id's to reduce the length? – Kannaiyan Aug 28 '18 at 15:56
  • You can encode it in base64 to reduce length. But UUIDs contain usable information about date, namespaces and so on. If you use your UUIDs only as keys and don't need to manipulate your data according to the information those keys contain, go ahead and encode them. – Quentin Hayot Aug 28 '18 at 16:02
  • Wouldn't these still have the slight chance of collision? How can you guarantee it's actually unique? Could two lambda processes theoretically create the uuid timestamp generation on the same tick with the same randomness? I don't understand how these can be guaranteed uuids – neaumusic Jun 04 '22 at 01:56
  • @neaumusic, I recommend that you read the RFC linked in my answer to learn details about how it works. – Quentin Hayot Jun 06 '22 at 23:21
  • From reading the RFC, it does not seem that uniqueness is guaranteed – neaumusic Jun 09 '22 at 02:12
  • @neaumusic the comments and answers on this question give an idea of how safe it is: https://stackoverflow.com/questions/1155008/how-unique-is-uuid#:~:text=UUIDs%20are%20unique%20%22for%20practical,make%20that%20possibility%20statistically%20significant. – Quentin Hayot Jun 09 '22 at 08:49
  • TL;DR: You have more chance of getting hit directly in the face by a meteor than getting a collision. But if it's still too high of a probability for the criticality of your system, you can check if it already exists in database and regenerate one if it does... But honestly, it would be less costly to manually deal with THE collision than to test every insert. (And again, as of today, there is probably no UUID used twice in all systems on earth combined). – Quentin Hayot Jun 09 '22 at 08:51
8

awsRequestId looks like it's actually a V4 (random) UUID. The code snippet below prints it:

exports.handler = function(event, context, callback) {
    console.log('remaining time =', context.getRemainingTimeInMillis());
    console.log('functionName =', context.functionName);
    console.log('AWSrequestID =', context.awsRequestId);
    callback(null, context.functionName);
};
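
If you want to check that observation yourself rather than eyeball the log output, here is a small sketch that reads the version digit out of a UUID string. The helper name `uuidVersion` is made up, and whether awsRequestId is always a V4 UUID is an observation, not something this code can guarantee.

```javascript
// Returns the version digit of an RFC 4122 style UUID string,
// or null if the string does not have that shape.
function uuidVersion(id) {
  // The version is the first hex digit of the third group;
  // the fourth group must start with 8, 9, a or b (the RFC 4122 variant).
  const pattern =
    /^[0-9a-f]{8}-[0-9a-f]{4}-([0-9a-f])[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;
  const match = pattern.exec(id);
  return match ? parseInt(match[1], 16) : null;
}
```

Inside the handler you could log `uuidVersion(context.awsRequestId)` and compare the result against 4.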

In case you want to generate this yourself, you can still use https://www.npmjs.com/package/uuid or Ulide (slightly better in performance) to generate different versions of UUID based on RFC 4122.

Go developers can use one of these packages: Google's UUID, Pborman, or Satori. Pborman performs better; check these articles and benchmarks for more details.

More information about the Universally Unique Identifier specification can be found here.

Muhammad Soliman
  • Useful for quickly testing in development, but I think it's better if entity IDs are really globally unique. – C.M. Feb 07 '20 at 03:32
2

We use the idgen npm package to create IDs. The length to choose depends on how many IDs you expect to generate; increase or decrease the size accordingly.

https://www.npmjs.com/package/idgen

We prefer this over UUIDs or GUIDs, since those are just numbers. In DynamoDB a GUID/UUID is stored as characters anyway, so with idgen you can create more IDs with fewer collisions using fewer characters, since each character covers a larger range of values.
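
To make the "larger range per character" argument concrete, here is a rough sketch of the arithmetic. It does not use idgen and does not reproduce its algorithm; `charsNeeded` and `shortId` are made-up helpers built on Node's `crypto` module, and the URL-safe 64-symbol alphabet is an assumption chosen for the example.

```javascript
const crypto = require('crypto');

// Number of characters needed to carry `bits` bits of randomness when
// each character is drawn from an alphabet of `alphabetSize` symbols.
function charsNeeded(bits, alphabetSize) {
  return Math.ceil(bits / Math.log2(alphabetSize));
}

// A random ID with 128 bits of randomness, written in a URL-safe
// 64-symbol alphabet (A-Z, a-z, 0-9, '-' and '_').
function shortId() {
  return crypto.randomBytes(16)
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, ''); // strip padding
}
```

With hex digits (16 symbols) 128 bits take `charsNeeded(128, 16)` = 32 characters, while a 64-symbol alphabet needs `charsNeeded(128, 64)` = 22, so the same collision resistance fits in a shorter string.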

Hope it helps.

EDIT1:

Note! As of idgen 1.2.0, IDs of 16+ characters will include a 7-character prefix based on the current millisecond time, to reduce likelihood of collisions.

Kannaiyan
  • I will check that out! But you accept the fact that there will be collisions? It seems to me that an approach based on random values can be a problem because collisions will increase in the long run if it's not based on a timestamp. – Philiiiiiipp Aug 28 '18 at 07:09
  • Added info on timestamp. It depends on how fast you generate IDs and how many you want per millisecond. No standard can guarantee collision-free generation. If you need more IDs, increase the width of the generated ID to reduce collisions. – Kannaiyan Aug 28 '18 at 16:00
2

If you are using the Node.js runtime, you can use this:

const crypto = require("crypto")
const uuid = crypto.randomUUID() 

or

import { randomUUID } from 'crypto'
const uuid = randomUUID()  
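
If a collision still worries you, one option is to let DynamoDB reject duplicates and retry a small, bounded number of times, so the function cannot loop until it times out. This is only a sketch under assumptions: `putWithUniqueId` is a made-up helper, `MyTable` with partition key `id` is a hypothetical table, and `documentClient` is assumed to be an aws-sdk v2 `DocumentClient` passed in by the caller.

```javascript
const { randomUUID } = require('crypto');

// Writes `item` under a freshly generated id and returns that id.
// Gives up after `maxAttempts` key collisions instead of retrying forever.
async function putWithUniqueId(documentClient, item, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const id = randomUUID();
    try {
      await documentClient.put({
        TableName: 'MyTable', // hypothetical table name
        Item: { ...item, id },
        // Reject the write if an item with this key already exists.
        ConditionExpression: 'attribute_not_exists(#id)',
        ExpressionAttributeNames: { '#id': 'id' },
      }).promise();
      return id;
    } catch (err) {
      // Anything other than a key collision is rethrown immediately.
      if (err.code !== 'ConditionalCheckFailedException') throw err;
    }
  }
  throw new Error('No free id found after ' + maxAttempts + ' attempts');
}
```

In practice the retry branch should essentially never run with random UUIDs; the condition is there as a safety net, not as the primary way of finding a free key.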
Ukpa Uchechi
0

Here is a better solution.

This logic can be built without using any library, because importing a Lambda function layer can sometimes be difficult. Below you can find the link to the code, which generates the unique ID and saves it in an SQS queue rather than in a DB, which would incur costs for writing, fetching, and deleting the IDs.

There is also a CloudFormation template provided, which you can deploy in your account, and it will set up the whole application. A detailed explanation is provided in the link.

Please refer to the link below.

https://github.com/tanishk97/UniqueIdGeneration_AWS_CFT/wiki

  • Please [add context to your link](https://meta.stackexchange.com/questions/8231/are-answers-that-just-contain-links-elsewhere-really-good-answers/8259#8259) so your fellow users will have some idea what it is and why it's there, and then quote the most relevant part of the page you're linking to in case the target page is unavailable. Answers that are little more than a link may be deleted. – Mickael B. May 03 '20 at 14:08