This is a continuation of an earlier question I asked: NoSpamLogger.java Maximum memory usage reached Cassandra
Based on that thread, I re-partitioned my data minute-wise instead of hourly, which has improved the stats for that table:
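For context, the minute-wise re-partitioning amounts to moving a minute-level time bucket into the partition key, so no single partition accumulates a full hour of rows. A hypothetical sketch (the keyspace, table, and column names here are made up for illustration, not from my actual schema):

```cql
-- Hypothetical schema: bucket rows by minute via a compound partition key,
-- so each partition only holds one minute of data per sensor.
CREATE TABLE sensor_data.readings (
    sensor_id     text,
    minute_bucket timestamp,   -- event time truncated to the minute
    event_time    timestamp,
    value         double,
    PRIMARY KEY ((sensor_id, minute_bucket), event_time)
);
```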
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%         35.00     73.46          3449259.15    219342          72
75%         35.00     88.15          4966933.18    943127          258
95%         35.00     182.79         4966933.18    5839588         1109
98%         35.00     315.85         4966933.18    17436917        3973
99%         35.00     379.02         4966933.18    36157190        11864
Min         30.00     20.50          2874382.63    51              0
Max         35.00     2346.80        4966933.18    3449259151      2816159
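For reference, Partition Size in this output is in bytes; a quick conversion of the Max value (copied from the table above) shows the scale:

```shell
# Convert the Max partition size (3449259151 bytes, from the histogram
# above) to decimal GB and binary GiB.
awk 'BEGIN {
  bytes = 3449259151
  printf "%.2f GB / %.2f GiB\n", bytes / 1e9, bytes / (1024 ^ 3)
}'
# prints "3.45 GB / 3.21 GiB"
```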
Notice that the 99th-percentile partition size is under 40 MB, yet the maximum partition size is still reported as 3.44 GB.
I also continue to see the 'Maximum memory usage reached' error in system.log every couple of days after a cluster restart.
So I am trying to hunt down the partitions that are reportedly this large. How can I find them?
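One approach I am considering is grepping system.log for Cassandra's large-partition warning, which is emitted when a flush/compaction writes a partition bigger than compaction_large_partition_warning_threshold_mb. A self-contained sketch below; the sample log line is fabricated for illustration (keyspace, table, and key are made up):

```shell
# Cassandra warns when it writes a partition larger than
# compaction_large_partition_warning_threshold_mb; grepping system.log
# for that message should name the offending partition keys.
# The sample line below is fabricated for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
WARN  [CompactionExecutor:4] BigTableWriter.java - Writing large partition ks1/table1:2019-01-01-00-00 (3449259151 bytes)
EOF
# Extract and count the keyspace/table:key occurrences, largest count first.
grep -o 'Writing large partition [^ ]*' "$log" | sort | uniq -c | sort -rn
rm -f "$log"
```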