I'm trying to pull some time series data from Cassandra. The data is being put there by an external application and it is using KairosDB to write the data (based on OpenTSDB). I don't have the option of putting the KairosDB libraries into the application I'm developing, but the Datastax Cassandra libraries are already available to me.
How would I write the basic queries to retrieve the information directly from Cassandra in a usable way? Queries where I can specify start/end times and additional tags for the row key indices. Some of this might be basic Cassandra knowledge I'm missing, but there are issues specific to the schema like the fact that sample times are stored as relative values from an absolute timestamp in row_key_index, or the fact that data could be split across multiple row_key_index columns depending on the retention of data and requested time span.
If I query the data directly from the column families I get something like this:
cqlsh:kairosdb> select * from string_index;
key | column1 | value
----------------------------+--------------------------------------------+-------
0x7461675f76616c756573 | all | 0x
0x7461675f76616c756573 | data_points | 0x
0x7461675f76616c756573 | eui.3765353036366237 | 0x
0x7461675f76616c756573 | row_key_index | 0x
0x7461675f76616c756573 | string_index | 0x
0x6d65747269635f6e616d6573 | kairosdb.datastore.queries_waiting | 0x
0x6d65747269635f6e616d6573 | kairosdb.datastore.query_time | 0x
0x6d65747269635f6e616d6573 | kairosdb.datastore.write_size | 0x
if I do a query on the row_key_index table, the data is all stored as a blob. Doing a blob->ascii conversion gives:
cqlsh:kairosdb> select blobAsAscii(key) from row_key_index;
blobAsAscii(key)
------------------------------------
kairosdb.datastore.write_size
kairosdb.datastore.write_size
kairosdb.datastore.write_size
kairosdb.datastore.write_size
kairosdb.datastore.write_size
The same conversion on column1 gives:
cqlsh:kairosdb> select blobAsAscii(column1) from row_key_index;
blobAsAscii(column1)
----------------------------------------------------------------------------------------
kairosdb.datastore.write_size\x00\x00\x00\x01A\x0fw\xa0\x00buffer=data_points:host=JMOROSKI-DEV:
kairosdb.datastore.write_size\x00\x00\x00\x01A\x0fw\xa0\x00buffer=row_key_index:host=JMOROSKI-DEV:
kairosdb.datastore.write_size\x00\x00\x00\x01A\x0fw\xa0\x00buffer=string_index:host=JMOROSKI-DEV:
kairosdb.datastore.write_size\x00\x00\x00\x01A{\x9d,\x00buffer=data_points:host=JMOROSKI-DEV:
kairosdb.datastore.write_size\x00\x00\x00\x01A{\x9d,\x00buffer=row_key_index:host=JMOROSKI-DEV:
and data points:
cqlsh:kairosdb> select blobAsAscii(column1) from data_points limit 10;
blobAsAscii(column1)
----------------------
\x00\x00\x00\x00
\x00\x00\x9c@
\x00\x018\x80
\x00\x01\xd4\xc0
\x00\x02q\x00
\x00\x03\r@
\x00\x03\xa9\x80
\x00\x04E\xc0
\x00\x04\xe2\x00
\x00\x05~@
I don't see times. I'm expecting an absolute time in the row_key_index and relative offsets in the data_points. I don't see values at times. I don't see a way to query stats for a specific machine since that information seems to be embedded into tags in row_key_index.column1 and I'm not aware of any LIKE or range slice equivalents in cql.