50

I have a requirement to generate unique Long ids for my database primary key column.

I thought i can use UUID.randomUUID().getMostSignificantBits() but sometimes its generating some negative long also which is problem for me.

Is it possible to generate only positive long from UUID ?There will be like billions of entries so i want that each generated key must be unique.

Cory Klein
  • 51,188
  • 43
  • 183
  • 243
Saurabh Kumar
  • 16,353
  • 49
  • 133
  • 212
  • Why are you don't use sequence? Are you able to use things like that? Or UUID this is a solution which you have to use? – Michał Ziober Mar 03 '13 at 10:49
  • can you explain more about sequence – Saurabh Kumar Mar 03 '13 at 10:53
  • What DB do you use? What db framework do you use(JDBC, Hibernate, myBatis)? – Taky Mar 03 '13 at 11:00
  • Please read about PostgreSQL sequence http://www.neilconway.org/docs/sequences/. This is just an example. In your database you should fine something similar to it. – Michał Ziober Mar 03 '13 at 11:01
  • I am using mysql. I want to do it in application side because if i will do it in database side i have to fire one more query again to get the id of the row..and i want to avoid that – Saurabh Kumar Mar 03 '13 at 11:04
  • Why do you think MostSignificantBits of the UUID will be unique? The UUID will be unique, but not necessarily the MostSignificantBits of the UUID. – user93353 Mar 03 '13 at 11:19
  • yes u are right user93353 . i just checked the java docs of UUID. – Saurabh Kumar Mar 03 '13 at 11:22
  • This scenario becomes more compelling if your application is database agnostic and you have to manage your own sequence. Keeping the primary key as a uuid string by far may not be as efficient compared to a number – Shajee Lawrence Oct 20 '18 at 07:29
  • Most databases that I have used(including MySQL) will return to you the ID that was generated for the newly inserted item on the insert method ie: the insert method returns a long which is the ID of the item that was inserted, likewise if you insert a list of items it will return a list of longs with indexes matching the list that was inserted – ProjectDelta May 07 '19 at 11:44

8 Answers8

64
UUID.randomUUID().getMostSignificantBits() & Long.MAX_VALUE

The reason why this works is, when you do bitwise & with 1 it allows the same digit to pass as it is and when you do bitwise & with 0 it blocks it and result is 0. Now, Long.MAX_Value in binary is

0111111111111111111111111111111111111111111111111111111111111111 

this is 0 followed by 63 1s (total is 64 bits, it's long in java)

So when you bitwise & a number X with this above number then you will get the same number X except that the leftmost bit is now turned into a zero. Which means you've only changed the sign of that number and not the value.

scytale
  • 12,346
  • 3
  • 32
  • 46
Sasha Pechenyi
  • 656
  • 6
  • 2
12

As the others have written, long does not have enough space for a unique number. But in many cases a number may be unique enough for a specific use. For example, a timestamp with the nanosecond precision is often good enough. To get it, shift the current milliseconds 20 bits left to allocate space for nanoseconds and then overlay it with the nanoseconds:

(System.currentTimeMillis() << 20) | (System.nanoTime() & ~9223372036854251520L);

The nano & ~9223372036854251520L part takes the current nanoseconds and sets the first 44 bytes to 0, leaving only the right 20 bits which represent nanoseconds up to one millisecond (999999 nanos) It is the same as:

nanoseconds & ~1111111111111111111111111111111111111111111100000000000000000000

Side note: nanoseconds should not be used to represent the current time because their starting point is not fixed in time and because they are recycled when they reach the maximum.

You can use any other bit manipulation. It is usually good to take into account the current time and something else such as the current thread id, process id, ip.

Daniel Nuriyev
  • 635
  • 6
  • 10
8

Take a look at http://commons.apache.org/sandbox/commons-id//index.html It has a LongGenerator that can give you exactly what you need.

In addition if you are using Hibernate then you can ask it to generate IDs for you (it has several algorithms you can choose from), in if not you can just take a look at their implementation for example http://grepcode.com/file/repo1.maven.org/maven2/hibernate/hibernate/2.1.8/net/sf/hibernate/id/TableHiLoGenerator.java#TableHiLoGenerator)

Doron Manor
  • 576
  • 4
  • 5
  • LongGenerator will create a sequential ID that means it can collide. While GUID is universally unique. It may suffice Saurabh's requirement but I think this is not an entirely correct answer – Sap Oct 09 '14 at 14:48
  • Apache commons-id is not available for download? Any other alternative? – JavaTechnical Jan 13 '15 at 15:21
5

This code is inspired by @Daniel Nuriyev's answer. But, instead of using nano-time, a counter (or discriminator as I've seen it called) is used when collisions occur in the same millisecond:

private static long previousTimeMillis = System.currentTimeMillis();
private static long counter = 0L;

public static synchronized long nextID() {
    long currentTimeMillis = System.currentTimeMillis();
    counter = (currentTimeMillis == previousTimeMillis) ? (counter + 1L) & 1048575L : 0L;
    previousTimeMillis = currentTimeMillis;
    long timeComponent = (currentTimeMillis & 8796093022207L) << 20;
    return timeComponent | counter;
}

This method generates a semi-unique ID by packing a millisecond timestamp-component together with a counter-component. The algorithm allows for roughly a million (or 1048575 to be exact) unique IDs to be generated in the same millisecond before collisions start to occur. Unique IDs are generated until the year 2248 at which point it will wrap around and start at 0 again.

The ID-generation is done as follows:

Milliseconds since epoch:

|0|000000000000000000000010110111101111100110001001111100101011111|

Bitwise AND with (8796093022207L):

|0|000000000000000000001111111111111111111111111111111111111111111|

to give you the 43 least significant bits as the time-component.

Then shift this to the left by 20 bits to give you:

|0|0010110111101111100110001001111100101011111|00000000000000000000|

Bitwise OR with 20 bits of counter (e.g. if counter is 3) to give you:

|0|0010110111101111100110001001111100101011111|00000000000000000101|

Only 43 bits (and not 44) are used for the time-component as we do not want to allow the most significant bit (which is the sign of the number) to be changed. This results in only positive IDs to be generated.

wcmatthysen
  • 445
  • 4
  • 19
  • 1
    It works a charm in Kotlin. Thank you so much! Work of art. If you then use an AtomicLong to store the result it would make it thread safe? – Beezer Apr 04 '22 at 16:00
  • @Beezer, the `nextID` method is thread-safe due to it being marked with the `synchronized` keyword (so no extra work is needed to make it thread-safe). It is simpler to make the method itself synchronized than to use `AtomicLong`. If you use `AtomicLong` for both the `previousTimeMillis` and `counter` variables` then you still need a locking object to ensure both these variables are updated atomically. – wcmatthysen Apr 05 '22 at 18:14
3

I just came across this solution. I am for the time being trying to understand the solution.It says Java implementation of twitter snowflake. 64 bit sequential ID generator based on twitter snowflake ID generation algorithm.

https://github.com/Predictor/javasnowflake

Any suggestions are welcome.

Saurabh Kumar
  • 16,353
  • 49
  • 133
  • 212
  • But again i see public synchronized String generateLongId(). The synchronized block will degrade performance in the long run, – Saurabh Kumar Mar 03 '13 at 11:41
  • 1
    How does long run or short run make a difference in any performance degradation caused by a synchronize? – user93353 Mar 03 '13 at 12:27
  • 2
    @SaurabhKumar I highly doubt you will run into performance issues with the small synchronized block. While synchronized is generally slower its not always slower than CAS (ie Java's NIO also google had a great talk on this) and its definitely faster than any database serial id generator (assuming you need the ID prior, database roundtrip ... etc). You should know there are also an enormous amount of other things that are synchronized in Java (most servlet containers do it somewhere). – Adam Gent Sep 25 '13 at 02:16
  • Well, @user93353 if you run it two times you won't notice the performance degradation. If you run it 2 million times you're missing lunch :D – Erk Aug 02 '18 at 16:45
2

You can also use a Time-Sorted Identifier (TSID), which is derived from Twitter's Snowflake ID.

A TSID is very similar to other ideas that have been shown here. There is no need to explain what others explained well a long time ago.

Example code:

public class TSID {

    private static final int RANDOM_BITS = 22;
    private static final int RANDOM_MASK = 0x003fffff;
    private static final long TSID_EPOCH = Instant.parse("2020-01-01T00:00:00.000Z").toEpochMilli();
    private static final AtomicInteger counter = new AtomicInteger((new SecureRandom()).nextInt());

    public static long next() {
        final long time = (System.currentTimeMillis() - TSID_EPOCH) << RANDOM_BITS;
        final long tail = counter.incrementAndGet() & RANDOM_MASK;
        return (time | tail);
    }
}

The above code was copied from Tsid and adapted to create a small example. It can generate up to 4,194,304 IDs per millisecond without collisions, assuming there is only one ID generator.

If you have concerns about security, i.e. generation time leakage and sequence guessing, check out Francesco's ID Encryptor library.

Note that if you intend to use 64-bit identifiers in Javascript, which has 52-bit precision, you need to convert them to some string format, i.e. hex, base-32, base-62, etc.

For more information, read these excellent articles by Vlad:

fabiolimace
  • 972
  • 11
  • 13
1

I want to do it in application side because if i will do it in database side i have to fire one more query again to get the id of the row..and i want to avoid that.

NO! You can use an AUTOINCREMENT primary key, and in JDBC retrieve the generated key with the INSERT.

String insertSQL = "INSERT INTO table... (name, ...)"
        + " VALUES(?, ..., ?)";
try (Connection connection = getConnection();
        PreparedStatement stmt = connection.prepareStatement(insertSQL,
                Statement.RETURN_GENERATED_KEYS)) {
    stmt.setString(1, ...);
    stmt.setInt(2, ...);
    stmt.setBigDecimal(3, ...);
    ...
    stmt.executeUpdate();

    ResultSet keysRS = stmt.getGeneratedKeys();
    if (keysRS.next()) {
        long id = keysRS.getInt(1);
    }
}

This is more efficient, and definitely easier, and safer. UUID are 128 bits. Taking just 64 bits reduces its uniqueness. So at least subjectively not 100% perfect. At least XOR (^) both long parts.

Joop Eggen
  • 107,315
  • 7
  • 83
  • 138
0

A bit late to reply but anyone reading this now, you can also implement LUHN algorithm to generate unique Id for your Primary Key. We have been using it for more than 5 years in our product and it does the job.

Priyam
  • 148
  • 1
  • 11