1

I have a problem with Cassandra read timeouts. Scenario:

  • 3 GB of data loaded into Cassandra,
  • 9 Cassandra nodes in 1 data center,
  • Replication factor is 3,
  • Consistency level is ONE,
  • Cassandra version 2.2.9

Link to cassandra.yaml: https://pastebin.com/x0bF7nLf

Tests: For testing I am using the JMeter plug-in for Cassandra. Each request is a SELECT with a condition that the row ID is in a provided list of IDs. The list always contains 100 IDs, and each request should always return 100 rows (all IDs are in the database). The IDs are random, so the effect of caching is reduced.

Sample select:

select * from price.item_vat_posting_group where no in ('B7B7A6','B2DD05','A34751','B4BC7D','C0BB53','D07DCB','C03716','BB99DF','A975C2','C2AE27','AF621C','242448','B30CDA','508336','B44D6B','D07422','AC44EA','C6F34D','9B25AC','C4CF12','AC25BD','C3D9C7','AE7DB2','C5E03E','BF7AC1','B499B5','A7787E','645180','A9BEFE','AFFEA4','A88955','D95B50','B0F9FC','C09174','253953','9ED9CA','CAF896','536951','214502','427776','DA14CB','422282','A4B10A','C56BF5','B373E0','D171EF','C70607','B350AB','9D809B','586563','BF6308','A4BF5A','C42716','C3261C','C45B79','C6FE55','D1F0D4','C483B5','A67D59','DC5898','9BACAD','D9C6B0','D17DAE','D8D4F3','A05946','BBEBA8','A87B37','A13E97','BB7099','A3FC26','C461DF','309810','BF6306','D07603','C59F70','C5906C','A515ED','B50056','A8390E','A0CCC7','BF2713','C6EC7D','D7EB9D','A5D5EB','984076','D88F44','257058','D61635','D40CDE','B0A347','B7617F','D6277E','B4286F','C41F99','D84232','DC1636','BFF15D','DD0972','9B3138');

Scenario 1. When sending requests from 100 threads for 10 minutes, 5% of all handled requests end in a ReadTimeout. The average request time is 100 ms. CPU usage on each Cassandra node is between 40% and 50%.

Scenario 2. When sending requests from 4 threads for 24 hours, about 10 ReadTimeouts occur per 100,000 requests. CPU usage on each Cassandra node is 5%.

In both scenarios the garbage collector runs for less than 300 ms.

Error message:

Cassandra time-out during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded) 
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded) 
at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69) 
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:258)

Some statistics:

QUESTION:

Is that typical for Cassandra, or am I doing something wrong?

pawski

2 Answers

4

You are using an IN query. An IN query puts a lot of pressure on the coordinator node. When you execute an IN query, you are waiting on this single coordinator node to give you a response; it keeps all those queries and their responses on the heap, and if one of those queries fails, or the coordinator fails, you have to retry the whole thing.

Instead of using an IN query, use executeAsync with a separate query for each no. Then, if one query fails, the retry requires only one small, fast query.
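A minimal sketch of that per-key approach, under stated assumptions: the `query` function is a hypothetical stand-in for whatever issues a single-partition read (for example a wrapper around the driver's executeAsync that returns a CompletableFuture), and `PerKeyReads`, `fetchAll`, `withRetry`, and `maxRetries` are illustrative names, not part of any Cassandra API. It only shows the fan-out and the per-key retry, so a failed read repeats one small query instead of the whole batch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class PerKeyReads {

    // Starts one asynchronous single-key query per ID, then waits for all of them.
    // `query` is a hypothetical stand-in for a single-partition read.
    static <R> Map<String, R> fetchAll(List<String> ids,
                                       Function<String, CompletableFuture<R>> query,
                                       int maxRetries) {
        Map<String, CompletableFuture<R>> pending = new LinkedHashMap<>();
        for (String id : ids) {
            pending.put(id, withRetry(id, query, maxRetries));
        }
        Map<String, R> rows = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<R>> entry : pending.entrySet()) {
            // join() throws CompletionException if the key still fails after all retries.
            rows.put(entry.getKey(), entry.getValue().join());
        }
        return rows;
    }

    // Retries only the failed key, up to retriesLeft more times.
    static <R> CompletableFuture<R> withRetry(String id,
                                              Function<String, CompletableFuture<R>> query,
                                              int retriesLeft) {
        return query.apply(id)
            .<CompletableFuture<R>>handle((row, error) -> {
                if (error == null) {
                    return CompletableFuture.completedFuture(row);
                }
                if (retriesLeft > 0) {
                    return withRetry(id, query, retriesLeft - 1);
                }
                CompletableFuture<R> failed = new CompletableFuture<>();
                failed.completeExceptionally(error);
                return failed;
            })
            .thenCompose(future -> future);
    }
}
```

With a real driver, `query` would bind the ID to a prepared single-key SELECT; how the driver's future type is adapted to CompletableFuture depends on the driver version and is left out here.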

Or

Change your data model so that you can specify the partition key when using an IN query.

Note: Too many executeAsync calls at a time can also put pressure on your cluster. Check this link: https://stackoverflow.com/a/30526719/2320144
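One common way to limit that pressure is to cap the number of in-flight requests with a semaphore. The sketch below assumes the same kind of hypothetical `query` function (a stand-in for an asynchronous single-key read returning a CompletableFuture); `ThrottledReads`, `fetchAll`, and `maxInFlight` are illustrative names, and a suitable limit would have to be found by measuring your own cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

final class ThrottledReads {

    // Issues one async query per ID, but never more than maxInFlight at a time.
    // `query` is a hypothetical stand-in for a single-partition async read.
    static <R> List<R> fetchAll(List<String> ids,
                                Function<String, CompletableFuture<R>> query,
                                int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<R>> futures = new ArrayList<>();
        for (String id : ids) {
            // Blocks the submitting thread while maxInFlight queries are outstanding.
            permits.acquireUninterruptibly();
            CompletableFuture<R> future;
            try {
                future = query.apply(id);
            } catch (RuntimeException e) {
                permits.release();  // the query never started, so free the slot
                throw e;
            }
            // Free the slot when the query completes, successfully or not.
            future.whenComplete((row, error) -> permits.release());
            futures.add(future);
        }
        List<R> rows = new ArrayList<>();
        for (CompletableFuture<R> future : futures) {
            rows.add(future.join());  // rows come back in the order of ids
        }
        return rows;
    }
}
```

Blocking the submitting thread is the simplest form of back-pressure; it keeps the client from queueing more work than the cluster is currently answering.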

More: https://lostechies.com/ryansvihla/2014/09/22/cassandra-query-patterns-not-using-the-in-query-for-multiple-partitions/

Ashraful Islam
  • Thanks for your reply. But even after eliminating the IN query, I still get ReadTimeouts. When sending requests from 6000 threads for 12 hours, I get 0.01% timeouts. Is that normal? – pawski Apr 10 '17 at 11:26
  • That seems OK, but if it's not acceptable for you, you can change your data model so that you can specify the partition key and execute a range query. – Ashraful Islam Apr 10 '17 at 11:32
  • My data model is very simple. Just a few attributes in one table. In the end: ONE REQUEST equals ONE query from ONE table. – pawski Apr 10 '17 at 12:31
  • Any tips? I have no idea what the reason might be. – pawski Apr 11 '17 at 07:10
0

Your query isn't efficient because it scans a lot of partitions, and those partitions are stored on different nodes. You should scan one partition, or fewer than 10, with a range condition. Change your data model; check these links:

https://www.datastax.com/dev/blog/the-most-important-thing-to-know-in-cassandra-data-modeling-the-primary-key

https://www.datastax.com/dev/blog/a-deep-look-to-the-cql-where-clause

V.HL