
I would like to generate a unique id for an entity and store the entity in a Cassandra database (only if an entity with the generated id does not already exist).

After generating the id, I check in the db whether an entity with the same id exists. If not, the entity is saved. Sample code from the MyService class:

synchronized (MyService.class) {
    do {
        id = generateId();
    } while (myDao.find(id) != null);
    sampleObject.setId(id);
    myDao.create(sampleObject);
}

In MyDao, I save the entity with:

cassandraOperations.insert(sampleObject);

What is the best practice for ensuring that a generated id does not already exist in the database? I feel that this synchronized block is not the most efficient solution. Or is there another way to ensure that the entity is inserted only if no entity with the same id exists in the database?

Lukasz_Plawny
  • Check [this](https://stackoverflow.com/questions/3935915/how-to-create-auto-increment-ids-in-cassandra) – Russiancold Oct 16 '17 at 21:45
  • Do you use uuid as id? If so the probability of collision is equal to 0. So you don't even need to check for existence – rvit34 Oct 16 '17 at 22:22
  • In my case id is an alphanumeric String. Even with UUID there is very low collision probability. – Lukasz_Plawny Oct 16 '17 at 22:25
  • Since you use Cassandra, I recommend using a UUID instead of an alphanumeric string. You can additionally perform get requests if you want to. But synchronized here is superfluous and makes no sense. – rvit34 Oct 16 '17 at 22:43
  • Let's assume that I use UUID. I still need to have some specific alphanumeric String, which needs to be unique. So even with UUID used as ID I need to perform this check. So I think that using UUID in my case is not a solution. – Lukasz_Plawny Oct 16 '17 at 22:52
  • UUID is already unique. Why do you need some additional specific alphanumeric String? – rvit34 Oct 16 '17 at 23:00
  • The alphanumeric String may contain any letter of the alphabet; this requirement is not fulfilled by UUID. – Lukasz_Plawny Oct 16 '17 at 23:09
  • you can use guid or auto increase ID in database – Jswq Oct 17 '17 at 01:01
  • @Lukasz_Plawny "Even with UUID there is very low collision probability." The probability that your entire development team is killed by wolves in the same night, at precisely the same point in time where a meteor lands on your data center, is much higher than the collision probability of UUIDs. – Jan Dörrenhaus Oct 17 '17 at 08:27
  • @JanDoerrenhaus you are right, collision probability is very low. Right now I'm going to remove this additional check and the synchronized block, and just implement the generateId() method in a similar way to random UUID generation (but with the characters that my custom id needs to use). I will still use a String as the primary key, generated using SecureRandom. – Lukasz_Plawny Oct 17 '17 at 08:38
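For anyone landing here later, the approach described in that last comment could look roughly like the sketch below. This is only an illustration, not the asker's actual code: the class name, the 62-character alphabet and the length of 22 are assumptions. With 62 symbols, 22 characters carry about 131 bits of randomness, slightly more than the 122 random bits of a version 4 UUID.

```java
import java.security.SecureRandom;

final class AlphanumericIds {
    // Assumed alphabet: digits plus upper- and lower-case letters (62 symbols).
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final SecureRandom RANDOM = new SecureRandom();

    private AlphanumericIds() {}

    /** Returns a random id of the given length, each character drawn uniformly from ALPHABET. */
    static String generateId(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            // nextInt(bound) is uniform over [0, bound), so there is no modulo bias.
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints one random 22-character id; the value differs on every run.
        System.out.println(generateId(22));
    }
}
```

If the id has to be shorter than that, the collision risk grows quickly, and a conditional insert becomes worth its cost.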

2 Answers


Type 1 UUIDs (timeuuid) guarantee no collisions provided you create fewer than 10k UUIDs per millisecond per host. That makes them the easiest solution, with no impact on throughput or latency. If you use a type 4 random UUID (the uuid type), the chance of a collision is lower than the chance of a supervolcano erupting under your datacenter, but it doesn't provide the guarantee that timeuuid does.
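To put a rough number on the random case, here is a small sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)), where N is the size of the id space and n the number of ids generated. A version 4 UUID has 122 random bits, so N = 2^122. The class and method names are made up for illustration, and the 8- and 22-character alphanumeric cases are assumed examples rather than anything from the question.

```java
final class CollisionEstimate {
    private CollisionEstimate() {}

    /**
     * Birthday approximation of the probability that at least two of n
     * uniformly random ids drawn from a space of the given size collide.
     */
    static double collisionProbability(double n, double space) {
        // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
        return -Math.expm1(-n * (n - 1) / (2.0 * space));
    }

    public static void main(String[] args) {
        double uuidV4Space = Math.pow(2, 122);   // 122 random bits in a type 4 UUID
        double alnum22Space = Math.pow(62, 22);  // 22 chars over a 62-symbol alphabet
        double alnum8Space = Math.pow(62, 8);    // a short id, for contrast

        System.out.printf("UUID v4, 1e9 ids:        %.3e%n",
                collisionProbability(1e9, uuidV4Space));
        System.out.printf("22-char alnum, 1e9 ids:  %.3e%n",
                collisionProbability(1e9, alnum22Space));
        System.out.printf("8-char alnum, 1e6 ids:   %.3e%n",
                collisionProbability(1e6, alnum8Space));
    }
}
```

The point of the comparison is that a long random id is safe without any check, while a short one is not, which is where the conditional insert becomes useful.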

If you want you can also use lightweight transactions with the IF NOT EXISTS clause on your query.

INSERT INTO keyspace_name.table_name
  ( identifier, column_name...)
  VALUES ( value, value ... ) IF NOT EXISTS

This applies the mutation only if the row does not already exist. The result contains an [applied] column that tells you whether it succeeded. If two clients try to insert the same row concurrently, only one of the inserts is applied.

https://docs.datastax.com/en/cql/3.1/cql/cql_reference/insert_r.html#reference_ds_gp2_1jp_xj__if-not-exists

This will be slower since it uses Paxos, which takes multiple round trips between the nodes of your cluster to complete.
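On the application side, the conditional insert replaces the synchronized check-then-insert from the question with a simple "try, and regenerate on conflict" loop. The sketch below shows only the shape of that loop. ConditionalStore is a hypothetical interface standing in for a DAO that runs INSERT ... IF NOT EXISTS and returns the [applied] flag, and InMemoryStore is a fake used so the example runs without a cluster. Neither is part of any Cassandra API.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

final class InsertIfAbsentDemo {
    private InsertIfAbsentDemo() {}

    /** Hypothetical stand-in for a DAO whose insert runs INSERT ... IF NOT EXISTS. */
    interface ConditionalStore {
        /** Returns true if the row was written, i.e. the [applied] flag was true. */
        boolean insertIfNotExists(String id, String payload);
    }

    /** In-memory fake: putIfAbsent is atomic, mimicking the conditional insert. */
    static final class InMemoryStore implements ConditionalStore {
        private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

        @Override
        public boolean insertIfNotExists(String id, String payload) {
            return rows.putIfAbsent(id, payload) == null;
        }
    }

    /** Generates an id, tries the conditional insert, and regenerates on conflict. */
    static String createWithUniqueId(ConditionalStore store, Supplier<String> idGenerator,
                                     String payload, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String id = idGenerator.get();
            if (store.insertIfNotExists(id, payload)) {
                return id; // the store accepted the row, so the id was free
            }
        }
        throw new IllegalStateException("no free id after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        ConditionalStore store = new InMemoryStore();
        String id = createWithUniqueId(store, () -> UUID.randomUUID().toString(), "payload", 3);
        System.out.println("stored under " + id);
    }
}
```

No lock is needed in the service because the uniqueness decision is made by the store in a single atomic step, not by a separate read followed by a write.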

Chris Lohfink

A UUID is a safe solution, but sometimes the id is not generated by you and so is not guaranteed to be unique, for example an SSN. To handle that case, Cassandra supports lightweight transactions. https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertLWT.html

On the application side, no synchronized block is required. The Cassandra ResultSet reports applied = true if the record was written.
Function to write:

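import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.querybuilder.QueryBuilder;

ResultSet insertIfNotExists(String id) {
    // Render INSERT ... IF NOT EXISTS as CQL text (DataStax driver 3.x QueryBuilder)
    String cql = QueryBuilder.insertInto("table_name")
            .value("id", id)
            .ifNotExists()
            .toString();

    // query(String) as in the original snippet; the method name depends on the Spring Data Cassandra version
    return cassandraOperations.query(cql);
}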

Usage:

ResultSet rs = insertIfNotExists("abc123");
if (rs.wasApplied()) {
    log.info("success");
}
Valchkou