We have a 24-node AWS cluster (i2.xlarge) running Cassandra 2.2.5, with one large table and a few smaller ones. The large table consumes most of the disk space, and disk usage is increasing unexpectedly.
The table uses Leveled Compaction Strategy (LCS), and cfstats shows that SSTables are not being compacted into higher levels:
SSTables in each level: [2, 20/10, 206/100, 2146/1000, 1291, 0, 0, 0, 0]
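For context, the "x/y" entries in that line are actual/target SSTable counts, and LCS sizes each level at roughly 10x the previous one, so with a fixed sstable size the per-level targets work out as below (a sketch, assuming the defaults):

```shell
# With LCS, each level holds ~10x the previous level's data, so with a
# fixed sstable size the target sstable counts per level are:
l1_target=$((10))
l2_target=$((10 * 10))
l3_target=$((10 * 10 * 10))
echo "targets: L1=$l1_target L2=$l2_target L3=$l3_target"
```

Against those targets, our L1 through L3 are all roughly 2x over.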
The dataset finished loading about a month ago, at which point disk usage was 60-65%. We are updating the dataset, and disk usage is climbing by about 0.5% per day; the nodes are currently 75-80% full. Rows are being updated, but no rows are added or deleted, so we did not expect disk usage to grow. Our best guess is that compactions are no longer removing duplicate data from the SSTables.
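At that rate the remaining headroom does not last long; a rough projection, assuming the growth stays linear:

```shell
# ~0.5% of disk per day, starting from ~80% full (round numbers taken
# from the figures above)
days_left=$(( (100 - 80) * 2 ))   # dividing by 0.5 == multiplying by 2
echo "days until full: $days_left"
```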
When we try to force a compaction on the table (nodetool compact), we get an error about insufficient disk space:
"error: Not enough space for compaction, estimated sstables = 1977, expected write size = 331746061359"
The LCS documentation claims that "Only enough space for 10x the sstable size needs to be reserved for temporary use by compaction." In our case, the compaction appears to require 1977 × 160 MB.
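The numbers in the error line up with that multiplier rather than the documented 10x; a quick sanity check, assuming the default 160 MiB sstable_size_in_mb:

```shell
# 1977 sstables at the default 160 MiB LCS sstable size, in bytes --
# close to the "expected write size = 331746061359" in the error.
estimate=$((1977 * 160 * 1024 * 1024))
echo "$estimate"
```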
We did come across a suggestion to reset the LCS compaction levels: "Leveled Compaction Strategy with low disk space".
However, when we tried this on a smaller cluster (with a smaller dataset showing the same issue), the compactions that followed also appeared to need a huge amount of space, not just the 1.6 GB promised.
Before:
SSTables in each level: [1, 20/10, 202/100, 7, 0, 0, 0, 0, 0]
Space used (live): 38202690995
After executing sstablelevelreset:
SSTables in each level: [231/4, 0, 0, 0, 0, 0, 0, 0, 0]
Space used (live): 38258539433
The first compaction after that started compacting 21698490019 bytes, which is about 129 SSTables' worth of data.
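That count is just the compaction's input size divided by the default 160 MiB sstable size:

```shell
# 21698490019 bytes of compaction input / 160 MiB per sstable
sstables=$((21698490019 / (160 * 1024 * 1024)))
echo "$sstables"
```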
On the small cluster we have enough spare disk space, but on the big one there does not appear to be enough room either to force a compaction or to start the levels over with the sstablelevelreset utility.
After the compactions finished, this is what the sstable levels look like (note that documents are continually being updated, but not added to the database):
SSTables in each level: [0, 22/10, 202/100, 13, 0, 0, 0, 0, 0]
Space used (live): 39512481279
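For completeness, here is the reset procedure we used on the small cluster; sstablelevelreset has to run while the node is down, and the service commands below are specific to our environment (a sketch, not verbatim):

```shell
# run on each node in turn; 'service cassandra' reflects our init
# setup -- adjust for yours
nodetool drain                 # flush memtables, stop accepting writes
sudo service cassandra stop
sstablelevelreset --really-reset overlordnightly document
sudo service cassandra start
```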
Is there anything else we can do to try to recover disk space, or at least to keep the disk usage from climbing?
The table is defined as follows:
CREATE TABLE overlordnightly.document (
    id bigint PRIMARY KEY,
    del boolean,
    doc text,
    ver bigint
) WITH bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.DeflateCompressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
Full cfstats from one of the nodes:
Keyspace: overlordprod
    Read Count: 68000539
    Read Latency: 3.948187530190018 ms.
    Write Count: 38569748
    Write Latency: 0.02441453179834102 ms.
    Pending Flushes: 0
        Table: document
        SSTable count: 3283
        SSTables in each level: [0, 22/10, 210/100, 2106/1000, 943, 0, 0, 0, 0]
        Space used (live): 526180595946
        Space used (total): 526180595946
        Space used by snapshots (total): 0
        Off heap memory used (total): 2694759044
        SSTable Compression Ratio: 0.22186642596102463
        Number of keys (estimate): 118246721
        Memtable cell count: 45944
        Memtable data size: 512614744
        Memtable off heap memory used: 0
        Memtable switch count: 1994
        Local read count: 68000545
        Local read latency: 4.332 ms
        Local write count: 38569754
        Local write latency: 0.027 ms
        Pending flushes: 0
        Bloom filter false positives: 526
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 2383928304
        Bloom filter off heap memory used: 2383902040
        Index summary off heap memory used: 24448020
        Compression metadata off heap memory used: 286408984
        Compacted partition minimum bytes: 87
        Compacted partition maximum bytes: 12108970
        Compacted partition mean bytes: 16466
        Average live cells per slice (last five minutes): 1.0
        Maximum live cells per slice (last five minutes): 1
        Average tombstones per slice (last five minutes): 1.0
        Maximum tombstones per slice (last five minutes): 1
Here is something that looks wrong about the compactions that are occurring; one in particular:
DEBUG [CompactionExecutor:1146] 2016-07-26 08:49:02,333 CompactionTask.java:142 - Compacting (cd2baa50-530d-11e6-9c8e-b5e6d88d6e11) [
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12943-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12970-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12972-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12953-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12955-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12957-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12978-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12976-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-4580-big-Data.db:level=4,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-14528-big-Data.db:level=2,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12949-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12959-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12974-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12962-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-11516-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12941-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12968-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12951-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12983-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12947-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12966-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12945-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12964-big-Data.db:level=3,
]
Notice that 23 SSTables are being compacted: one from level 2, one from level 4, and the rest from level 3. This compaction also needed more than 10x the sstable size (3,720,676,532 bytes to 3,531,157,508). It ended up compacting these to level 3, but I was under the impression that SSTables only move up in level, so why is a level-4 SSTable being compacted to level 3? Now that I've noticed this in the logs, I see it is a frequent occurrence. For example, here's another from around the same time:
DEBUG [CompactionExecutor:1140] 2016-07-26 08:46:47,420 CompactionTask.java:142 - Compacting (7cbb0390-530d-11e6-9c8e-b5e6d88d6e11) [
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12910-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-14524-big-Data.db:level=2,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12908-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-12906-big-Data.db:level=3,
/data/cassandra/overlordprod/document-57ed497007c111e6a2174fb91d61e383/la-3543-big-Data.db:level=4,
]
I don't know if this is a problem or not.