How to query a simple epoch-based partition column in Cassandra?

Question

I do not know the primary key while querying in Cassandra.
All I know is the time range for which I want to get the data.
So I have kept the partition key of my table as the epoch.

CREATE TABLE tmp2 (
    epoch bigint,
    primary key (epoch)
);

INSERT INTO tmp2 (epoch) values (unixTimestampOf(now()));
INSERT INTO tmp2 (epoch) values (unixTimestampOf(now()));
INSERT INTO tmp2 (epoch) values (unixTimestampOf(now()));
INSERT INTO tmp2 (epoch) values (unixTimestampOf(now()));
INSERT INTO tmp2 (epoch) values (unixTimestampOf(now()));


select * from tmp2;
 epoch
---------------
 9410
 8171
 7595
 8746
 6009

Note: I have reduced the epochs to more readable numbers here by getting rid of the common prefix 148826477

I try to query this table based on epoch range:

Select * from tmp2 where epoch >= 6009 and epoch <= 7595;
InvalidRequest: Only EQ and IN relation are supported on the partition key (unless you use the token() function)

So I use the token() function and all is good:

Select * from tmp2 where token(epoch) <= token(6009) and token(epoch) >= token(7595);
 epoch
---------------
 7595
 8746
 6009

However, I am suspicious that using token is not actually giving me querying based on time because token(epoch) and token(epoch+100) might not have same linear relationship as epoch and epoch+100 have.
Example:

Select * from tmp2 where token(epoch) <= token(7009) and token(epoch) >= token(7595);

 epoch
-------
(0 rows)

Is there a way to query the epoch based partition column in chunks of 1-minute? OR
Is there a way to get rid of the milliseconds and seconds in the epoch while storing the data, so that the epoch's resolution is only a minute and I can specify the partition easily for minutes?

I was hoping to convert the epoch into minute within the CQL itself. Something like `INSERT INTO tmp2 (epoch) values (setResolutionToLastMinute(now()))`. Does CQL have something that will reduce the resolution of now() to a minute? — user2250246, Feb 28 '17 at 19:26

score 0 · Answer 1 · edited May 23 '17 at 12:17

However, I am suspicious that using token is not actually giving me querying based on time because token(epoch) and token(epoch+100) might not have same linear relationship as epoch and epoch+100 have.

Your suspicious are right. They are absolutely two very different things indeed. I recently posted explanation about the difference between querying by data and querying by the token function.

Back to your questions:

No, there's no way to perform such task, unless your partition key is already at 1-minute granularity
No, you have to round it on your client side code and insert it into your timestamp column.

If you just follow my advice on your previous question you'd solve in a blink of a eye.

I wish to add that a fundamental thing in cassandra is you really need to know the partition key when querying data, this is what the error says.

How to query a simple epoch-based partition column in Cassandra?

1 Answers1