Handle correctly a big number of asynchronous queries

Question

I have to update several thousand records in a Cassandra table, and I use the executeAsync(BoundStatement) method. However, I get the error "Pool is busy (no available connection and the queue has reached its max size 256)"; see the full details below.

What is the best way to correctly handle this type of execution? What about increasing the Cassandra query queue size and the wait time in the queue?

Exception in thread "Thread-6" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra2/172.18.0.17:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [cassandra2/172.18.0.17] Pool is busy (no available connection and the queue has reached its max size 256)), cassandra4/172.18.0.18:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [cassandra4/172.18.0.18] Pool is busy (no available connection and the queue has reached its max size 256)), cassandra1/172.18.0.11:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [cassandra1/172.18.0.11] Pool is busy (no available connection and the queue has reached its max size 256)))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:75)
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:28)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:28)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:236)

I use a cluster with 3 nodes, QUORUM consistency and replication factor of 3.

If there is too much load, retrying can go on almost indefinitely because requests fail too often. For example, if you execute several batches of 1k updates against a queue of 256 elements, it fails almost immediately, and you don't know which queries were executed and which failed, so you cannot exclude the successful ones from your retry. - Nicolas Henneaux, Apr 06 '18 at 05:39

score 2 · Accepted Answer · answered Apr 06 '18 at 15:08

There are two options:

1. Tune the connection pool to achieve maximum performance for your cluster.
2. Throttle the number of async queries to avoid NoHostAvailableException, e.g. as described here.
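As a rough illustration of the first option, the pool limits mentioned in the question (queue size and wait time in the queue) can be set through `PoolingOptions` in the DataStax Java driver 3.x, which is the driver shown in the stack trace. This is a minimal sketch, not a recommended configuration: the class name `PoolTuningSketch` is hypothetical, every numeric value is an arbitrary assumption that would need to be tuned against the actual cluster, and the contact point simply reuses a host name from the stack trace.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;

// Hypothetical helper class; all numbers below are illustrative assumptions.
class PoolTuningSketch {

    static Cluster buildCluster() {
        PoolingOptions poolingOptions = new PoolingOptions()
                // Core and maximum number of connections per local host.
                .setConnectionsPerHost(HostDistance.LOCAL, 2, 4)
                // Concurrent requests allowed on each connection.
                .setMaxRequestsPerConnection(HostDistance.LOCAL, 2048)
                // Requests that may wait for a connection (256 in the error above).
                .setMaxQueueSize(1024)
                // How long a queued request waits before failing, in milliseconds.
                .setPoolTimeoutMillis(10000);

        return Cluster.builder()
                .addContactPoint("cassandra1") // host name taken from the stack trace
                .withPoolingOptions(poolingOptions)
                .build();
    }
}
```

A larger queue only moves the limit; if the producer submits faster than the cluster can answer, the queue still fills up, which is why throttling is listed as the second option.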

Mikhail Baksheev

score 2 · Answer 2 · answered Apr 06 '18 at 16:40

You could increase the queue size, but that really just postpones the problem. You would probably want to retry any requests that fail this way, but it may be in your best interest to use a semaphore and permit only a certain number of requests to be in flight at a time, rather than queueing up all the inserts at once.
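The semaphore idea can be sketched as follows. This is a self-contained illustration under stated assumptions, not driver code: `ThrottledAsyncDemo`, `Stats`, `runThrottled`, and `fakeExecuteAsync` are hypothetical names, and `fakeExecuteAsync` stands in for `session.executeAsync(boundStatement)` by returning a `CompletableFuture` that completes after a short delay.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class ThrottledAsyncDemo {

    /** Outcome of one run: completed queries and the peak number in flight. */
    static final class Stats {
        final int completed;
        final int peakInFlight;

        Stats(int completed, int peakInFlight) {
            this.completed = completed;
            this.peakInFlight = peakInFlight;
        }
    }

    // Hypothetical stand-in for session.executeAsync(boundStatement).
    static CompletableFuture<Void> fakeExecuteAsync(ExecutorService pool) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(1); // simulated query latency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, pool);
    }

    static Stats runThrottled(int totalQueries, int maxInFlight) {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < totalQueries; i++) {
                // Blocks the producer once maxInFlight queries are pending.
                permits.acquireUninterruptibly();
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                fakeExecuteAsync(pool).whenComplete((result, error) -> {
                    // With a real driver, a non-null error would be recorded
                    // here so that only the failed statements are retried.
                    completed.incrementAndGet();
                    inFlight.decrementAndGet();
                    permits.release();
                });
            }
            // Taking every permit means every callback has already run.
            permits.acquireUninterruptibly(maxInFlight);
        } finally {
            pool.shutdown();
        }
        return new Stats(completed.get(), peak.get());
    }

    public static void main(String[] args) {
        Stats stats = runThrottled(500, 32);
        System.out.println(stats.completed + " completed, peak in flight " + stats.peakInFlight);
    }
}
```

Keeping `maxInFlight` below what the connection pool can absorb means the driver queue should not overflow in the first place. With the 3.x driver, the `ResultSetFuture` returned by `executeAsync` is a Guava `ListenableFuture`, so the permit would presumably be released from a listener or callback attached to that future rather than from `whenComplete`.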

You could also use a mechanism like this if ordering is important:

Spark Cassandra Connector slidingIterator for controlling the execution of many asynchronous requests
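One way to read "a mechanism like this" is a sliding window over futures: at most a fixed number of requests are pending, and results are consumed in submission order. The sketch below is an assumption about the general technique, not the Spark Cassandra Connector's implementation; `SlidingWindowDemo`, `runOrdered`, and the `asyncCall` parameter are hypothetical names, with `asyncCall` standing in for a call such as `executeAsync`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class SlidingWindowDemo {

    /**
     * Runs asyncCall for every input with at most `window` calls pending,
     * and returns the results in the same order as the inputs.
     */
    static <T, R> List<R> runOrdered(List<T> inputs, int window,
                                     Function<T, CompletableFuture<R>> asyncCall) {
        Deque<CompletableFuture<R>> pending = new ArrayDeque<>();
        List<R> results = new ArrayList<>(inputs.size());
        for (T input : inputs) {
            if (pending.size() == window) {
                // Window is full: wait for the oldest request before adding one.
                results.add(pending.removeFirst().join());
            }
            pending.addLast(asyncCall.apply(input));
        }
        // Drain whatever is still pending, oldest first.
        while (!pending.isEmpty()) {
            results.add(pending.removeFirst().join());
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> inputs = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            inputs.add(i);
        }
        // Stand-in for an asynchronous query: doubles the input on another thread.
        List<Integer> doubled = runOrdered(inputs, 3,
                i -> CompletableFuture.supplyAsync(() -> i * 2));
        System.out.println(doubled);
    }
}
```

Compared with the semaphore approach, this waits on the oldest request rather than on any finished one, so a single slow request can stall the window; that is the price of preserving order.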
