The Problem
Simple CQL select failing when I have a large data load.
Setup
I am using the following Cassandra schema:
CREATE KEYSPACE fv WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
create table entity_by_identifier (
identifier text,
state entity_state,
PRIMARY KEY(identifier)
);
CREATE TYPE entity_state,(
identifier text,
number1 int,
number2 double,
entity_type text,
string1 text,
string2 text
);
The query I am trying to execute:
SELECT * FROM fv.entity_by_identifier WHERE identifier=:identifier;
The Issue
This query works fine in a small dataset (tried with 500 rows). However, with a large data load test, I am creating over 5million rows in this table before proceeding to execute this query multiple times (10 threads continuously performing this query for 1 hour).
Once the data load has completed, the queries begin but immediately fail with the following error:
com.datastax.driver.core.exceptions.ReadFailureException: Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)
at com.datastax.driver.core.exceptions.ReadFailureException.copy(ReadFailureException.java:85)
at com.datastax.driver.core.exceptions.ReadFailureException.copy(ReadFailureException.java:27)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:64)
...my calling classes...
I've checked the Cassandra log and found only this exception:
java.lang.AssertionError: null
at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.canRemoveRow(SinglePartitionReadCommand.java:899) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.reduceFilter(SinglePartitionReadCommand.java:863) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndSSTablesInTimestampOrder(SinglePartitionReadCommand.java:748) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:519) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:496) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:358) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:366) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1797) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2466) ~[apache-cassandra-3.7.jar:3.7]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.concurrent.SEPExecutor.maybeExecuteImmediately(SEPExecutor.java:192) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:117) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:85) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.AbstractReadExecutor$NeverSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:214) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1702) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1657) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1604) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1523) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.execute(SinglePartitionReadCommand.java:335) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:67) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:34) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:325) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:361) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:237) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:78) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:486) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:463) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) [apache-cassandra-3.7.jar:3.7]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:32) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:283) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.7.jar:3.7]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
As you can see I am using Cassandra 3.7. The Datastax driver in use is version 3.1.0.
Any ideas why the larger data set could cause this error?