I've created a cluster of 3 Cassandra nodes and a process that feeds the cluster with data. The feed process puts quite a lot of stress on the cluster, around 10000 batches/sec, and it runs continuously for a couple of days. As expected, Cassandra creates a lot of sstable files and compacts them almost continuously as well. But these files pile up, and I currently have 300 of them on a 70GB-per-node database (200GB overall). Even if I stop the feed and the cluster is idle, it doesn't seem to continue compacting and the number of files stays large. Is there a way to force Cassandra to compact most of the files?
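
The SSTable count and pending compaction tasks can be checked per node with nodetool (mykeyspace is just a placeholder for the actual keyspace name):

nodetool compactionstats        # pending compaction tasks on this node
nodetool cfstats mykeyspace     # prints an "SSTable count" line per table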

I am using leveled compaction, here is one of my tables:

CREATE TABLE data (
  id bigint,
  data blob,
  PRIMARY KEY (id)
) WITH
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
kostas.kougios

1 Answer

Run a nodetool flush and then run a nodetool compact. Cassandra compacts when the number of SSTables grows beyond the threshold configured in the table's compaction options. By default (for size-tiered compaction), whenever the number of sstables grows to 4, they get compacted into one.
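
For example, with a keyspace named mykeyspace (a placeholder, substitute your own), the flush and a per-table compaction would look like:

nodetool flush mykeyspace data      # flush the table's memtables to SSTables on disk
nodetool compact mykeyspace data    # force a major compaction of just that table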

Ananth
  • I currently have around 100 sstables for the mentioned table, shouldn't those be compacted? – kostas.kougios Apr 13 '14 at 15:34
  • Then run a compaction for that particular column family alone. Having one larger SSTable doesn't improve your read ops either. Having 100 sstables or more will not impact your read performance as long as your schema is proper... – Ananth Apr 13 '14 at 15:37
  • ah, really performance won't be impacted? Is it due to bloom filters again? – kostas.kougios Apr 13 '14 at 16:41
  • I am getting bad read performance as my data grow larger. I am currently on 250 queries per sec on 200GB split in 3 nodes (the nodes run on the same box but different 7200rpm disks and the box has 16 cores). The above table is one of my two main tables, the other one is a time-series table which we discussed and I believe I've optimized: http://stackoverflow.com/questions/22792260/does-cassandra-read-the-whole-row-when-limiting-the-number-of-requested-results/22800464?noredirect=1#comment34865572_22800464 – kostas.kougios Apr 13 '14 at 17:06
  • You are generating a unique id and running this query. That means you are populating data the way you would in SQL. Kindly think of the schema in terms of NoSQL. If you are growing to millions or billions of rows, I feel your design has to scale for data like Facebook's. – Ananth Apr 14 '14 at 12:47
  • which table do you mean? This "data" table, or my other "t" time-series like table? – kostas.kougios Apr 14 '14 at 18:54
  • throughput of the "data" table is around 80/sec on a single-threaded runner, the "t" table is quite a bit worse at 10/sec, again single-threaded. That means 100ms/query. "t" is quite a bit bigger in GB compared to "data". – kostas.kougios Apr 14 '14 at 20:08
  • more weirdness: the commit log directories of the nodes in my cluster are massive: 214GB commitlog for 113GB data! – kostas.kougios Apr 14 '14 at 20:52
  • You are using leveled compaction for a write-heavy workload and expecting read performance to be excellent. I would like to model your schema properly based on your use case, but I would suggest you concentrate more on schema design. You are still thinking in terms of an RDBMS. Don't think you can scale with an RDBMS model in NoSQL. This post might help you: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction – Ananth Apr 15 '14 at 01:58
  • Indeed I have a heavy write workload, but in my latest read benchmark I disabled the writes and only benchmarked the reads. Leveled compaction seems like a fit for what I do because it minimizes the number of sstables required to read a row. But still, 3 nodes read avg 15 rows of my "t" table. My use case is a time-series "t" table that is very similar to facebook posts. It contains a userid as the partition key, time as a clustering key and the actual text, and I want to query it in a similar way to how a facebook wall works, so that I get the latest messages on top (see the sketch after this thread). – kostas.kougios Apr 15 '14 at 09:58
  • It's pretty standard table schema for cassandra, no? – kostas.kougios Apr 15 '14 at 09:59
  • You are scaling in rows. A facebook-wall-like feature is best suited to a wide-rows approach. – Ananth Apr 15 '14 at 11:13
  • what do you mean? I believe cql3 actually scales the table by creating wide rows based on the clustering key, no? So for the partition key it creates 1 wide row that has many columns based on the clustering key. t.id is the partitioning key in my case, t.idx is the clustering key (which is converted internally to a wide row), correct? – kostas.kougios Apr 15 '14 at 11:52
  • so my schema is similar to the example here: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows – kostas.kougios Apr 15 '14 at 12:02
  • Kindly have a look at the same post you have shared. You have an explicit clustering key definition. If the clustering key is not used, you are simply creating a row with the only id you have used. Result: no benefit from using leveled compaction either. – Ananth Apr 15 '14 at 12:17
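
For reference, a minimal CQL sketch of the wide-row time-series table discussed in the comments (the table and column names are illustrative placeholders, not the asker's actual schema):

CREATE TABLE t (
  userid bigint,      -- partition key: one wide row per user
  posted timestamp,   -- clustering key: orders the cells within that row
  message text,
  PRIMARY KEY (userid, posted)
) WITH CLUSTERING ORDER BY (posted DESC);

-- "wall"-style query: latest messages for a user, newest first
SELECT posted, message FROM t WHERE userid = 123 LIMIT 20;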