We have a table with about 40k rows, and querying on a secondary index is slow (30 seconds in production). We are running Cassandra 1.2.8. The table schema is as follows:
CREATE TABLE usertask (
tid uuid PRIMARY KEY,
content text,
ts int
) WITH
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
CREATE INDEX usertask_ts_idx ON usertask (ts);
When I turn on tracing, I notice there are a lot of lines like the following:
Executing single-partition query on usertask.usertask_ts_idx
With only 40k rows, it looks like there are several thousand queries on usertask_ts_idx. What could be the problem? Thanks
More investigation
I tried the same query on our test server, and it is much faster (30 seconds on prod, 1-2 seconds on the test server). Comparing the trace logs, the difference is the time spent seeking to the partition indexed section in the data file. On production each seek takes 1000-3000 microseconds; on the test server it takes about 100 microseconds. My guess is that our production server does not have enough memory to cache the data file, so seeking in the data file is slow.
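As a rough sanity check on this guess, the per-seek timings can be compared against the total query times. The sketch below is only a back-of-the-envelope estimate: it assumes the seeks run one after another and account for essentially all of the query time, which the trace does not confirm. The variable names are illustrative.

```python
# Figures reported above.
PROD_QUERY_S = 30             # total query time on production, in seconds
PROD_SEEK_US = (1000, 3000)   # per-seek time on production, in microseconds
TEST_SEEK_US = 100            # per-seek time on the test server, in microseconds

US_PER_S = 1_000_000

# Assumption: seeks are sequential and dominate the total query time.
# Faster seeks imply more of them fit into the same 30 seconds.
seeks_high = PROD_QUERY_S * US_PER_S // PROD_SEEK_US[0]
seeks_low = PROD_QUERY_S * US_PER_S // PROD_SEEK_US[1]

# Time the same number of seeks would take at the test server's seek speed.
test_low_s = seeks_low * TEST_SEEK_US / US_PER_S
test_high_s = seeks_high * TEST_SEEK_US / US_PER_S

print(f"implied seeks on production: {seeks_low} to {seeks_high}")
print(f"same seeks at test-server speed: {test_low_s} to {test_high_s} seconds")
```

Under those assumptions the estimate is 10,000 to 30,000 seeks, which at 100 microseconds each would take 1 to 3 seconds. That is roughly in line with the 1-2 seconds seen on the test server, so the slower seeks alone could plausibly explain most of the gap.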