We are running Cassandra 3.0.16 on a cluster of i3.2xlarge instances in AWS. The data volumes are encrypted with LUKS. We are running a job that needs to read 3 TB of data from two tables by issuing individual queries on single record keys. Judging by the CloudWatch IO metrics for one of the Cassandra instances, Cassandra will read thousands of terabytes from disk before the job finishes. As a result, the job takes 6x longer than expected.
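To put a rough number on the excess IO, here is a minimal sketch of the read-amplification arithmetic. The 1000 TB figure is only a placeholder lower bound for "thousands of terabytes", not a measured value; it also assumes both numbers cover the same scope (for example, both cluster-wide). The function name is made up for illustration.

```python
TB = 10**12  # decimal terabyte, in bytes


def read_amplification(disk_bytes_read: float, logical_bytes: float) -> float:
    """Return how many bytes were read from disk per byte the job actually needed."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return disk_bytes_read / logical_bytes


# Placeholder values: 1000 TB stands in for "thousands of terabytes" (assumption),
# 3 TB is the amount of data the job needs to read.
factor = read_amplification(1000 * TB, 3 * TB)
print(f"read amplification: at least {factor:.0f}x")
```

Even with the conservative placeholder, the disks would be reading hundreds of bytes for every byte the job requests.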
We have fully compacted the two tables being read, but that improved performance by only 10%. We have also ruled out encryption as the cause: a cluster with unencrypted volumes shows the same slow performance.
Are there any Cassandra configuration settings that can be tuned to reduce excessive IO?