
I have a table:

CREATE TABLE myTable (
    id bigint,
    other_id text,
    my_set SET<bigint>,
    my_date timestamp,
    PRIMARY KEY (id, other_id)
);

I want a common TTL across all columns, so that when the TTL expires the whole row disappears.

I manage to INSERT new rows with:

INSERT INTO myTable (id, other_id, my_date, my_set) 
VALUES (1,'foo','2014-10-20 12:05:08-0300', {22}) 
USING TTL 20;

I can also UPDATE my set to add new elements with:

UPDATE myTable USING TTL 20 SET my_set=my_set + {99}
                             WHERE id=1 AND other_id='foo';

But my problem is that this new element gets its own TTL, so after some time the number 22 disappears while 99 is still present.

How can I add a new element to my set with the same TTL as the other elements of the set?

The solution I am thinking of is to make two queries:

  1. Ask the my_date column for its TTL.
  2. With that TTL, make the UPDATE that adds the new element to my_set (roughly as sketched below).
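
Something like this (just a sketch; the 12 is whatever value the first query returned, copied over by my application, since I cannot reference it directly in CQL):

-- 1. read the remaining TTL from a regular column of the row
SELECT TTL(my_date) FROM myTable WHERE id=1 AND other_id='foo';

-- 2. suppose it returned 12: reuse it when adding the new element
UPDATE myTable USING TTL 12 SET my_set=my_set + {99}
                             WHERE id=1 AND other_id='foo';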

Is there a better solution?

George C

2 Answers


I don't know much about what your solution is trying to achieve, but from my experience TTLs inside large Collections are bad news (as is having more than a couple dozen elements in a Collection).

Anyway, since under the covers Sets are implemented as discrete columns in the form of (name=setName:valueOfItem, value=) (see this article for more), you'll have to hack something on top. My suggestion would be to encapsulate your Collection behind a clustering column, like so:

CREATE TABLE my_table (
    id bigint,
    other_id text,
    collection_ttl timeuuid,
    my_set SET<bigint>,
    my_date timestamp,
    PRIMARY KEY (id, other_id, collection_ttl)
);

Here collection_ttl could just be the insertion time; it is essentially meant to track the TTL of the contained Collection.
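
For example, one (hypothetical) way to use it is to rewrite the whole Set in a single statement whenever it changes, stamping each generation with now(), so every cell of that generation shares one TTL:

INSERT INTO my_table (id, other_id, collection_ttl, my_set, my_date)
VALUES (1, 'foo', now(), {22, 99}, '2014-10-20 12:05:08-0300')
USING TTL 20;

The previous generation of the Set (the row with the older collection_ttl) will simply expire on its own, or you can DELETE it explicitly.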

Or you could unroll your Set into a Clustering column and achieve essentially the same thing by doing:

CREATE TABLE my_table_no_set (
    id bigint,
    other_id text,
    my_set_value bigint,
    my_date timestamp,
    PRIMARY KEY (id, other_id, my_set_value)
);

This would be much friendlier to Cassandra if you have a fairly large Collection, and you can just do a TTL on the other_id or something similar.
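
For example (a sketch with made-up numbers; each element of the former Set becomes its own row, so a new element can be written with whatever TTL the existing elements still have left):

-- original element, written when the logical row is created
INSERT INTO my_table_no_set (id, other_id, my_set_value, my_date)
VALUES (1, 'foo', 22, '2014-10-20 12:05:08-0300') USING TTL 20;

-- later: how long does the existing element still have to live?
SELECT TTL(my_date) FROM my_table_no_set
 WHERE id=1 AND other_id='foo' AND my_set_value=22;

-- say it returned 12: write the new element with that remaining TTL
INSERT INTO my_table_no_set (id, other_id, my_set_value, my_date)
VALUES (1, 'foo', 99, '2014-10-20 12:05:08-0300') USING TTL 12;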

fromanator
  • Hi @fromanator, thanks for the answer! I am going to work with sets of 100-150 elements. I think I will go with the first option you proposed. Do you think that 100-150 elements in the set will impact Cassandra's performance? – George C Mar 29 '16 at 12:20
  • It's hard to say; I only have experience seeing poor performance with collections that have around 1000 items. In general, Collections in Cassandra are only meant for a "small" number of items. Cassandra reads Collections in their entirety (even if you just want a single item from one). If your use case involves reading one item or a range of items from your collection, then you're usually better off "unrolling" your collection with a clustering column, as in my second example. – fromanator Mar 31 '16 at 04:02
  • Thank you very much @fromanator! I have to read all the values of the set, so I think that in my particular case it is the same. I will keep your comment in mind in case I ever have to read only some values. – George C Mar 31 '16 at 18:23

You can use the TTL() function in a SELECT, applied to a regular column such as my_date (TTL() cannot be used on the set itself):

SELECT TTL(my_date) FROM myTable WHERE id=1 AND other_id='foo';

Then use that value... but I do not think that will work for what you are trying to do.

Instead, I would add a column holding the date when the row was created. More or less, something like this:

CREATE TABLE myTable (..., created_on timestamp);
INSERT INTO myTable (..., created_on) VALUES (..., toTimestamp(now()));
...
SELECT created_on FROM myTable WHERE ...
INSERT INTO myTable ... USING TTL original_ttl - (now - created_on) ...

(instead of toTimestamp(now()) you may have to use dateOf(now()) depending on the version of CQL you are using; see documentation. You may also use your own timestamp value, of course.)

Here original_ttl is the TTL you used when the row was first inserted (20 in your example). If original_ttl - (now - created_on) is smaller than 1, then you should not process that INSERT. (Actually, I read somewhere that the minimum value of a TTL should probably be at least 3 seconds, but even 3 is really small for a database TTL...)
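
Concretely, with the table from the question it could look like this (a sketch; the numbers are made up, and the remaining TTL has to be computed by the application, since CQL does not evaluate arithmetic in USING TTL):

-- row originally written with USING TTL 20, created_on = 2014-10-20 12:05:08
-- 7 seconds later you want to add 99, so the remaining TTL is 20 - 7 = 13
UPDATE myTable USING TTL 13 SET my_set=my_set + {99}
                             WHERE id=1 AND other_id='foo';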

There is also a writetime() function (it has to be applied to a regular column, not to a primary key column). But I'm not too sure whether that would work for you. You can of course give it a try:

SELECT writetime(my_date) FROM myTable WHERE ...

Writetime() most likely changes on an UPDATE, but I do not know whether it changes on a per-column or per-row basis, and it is not recommended to use such system information for your own data (How to retrieve the timestamp from cassandra?).

Alexis Wilke