As the Javadoc suggests, one can set a timeout on a SqlQuery by calling setTimeout before executing it: https://ignite.apache.org/releases/2.4.0/javadoc/org/apache/ignite/cache/query/SqlQuery.html#setTimeout-int-java.util.concurrent.TimeUnit-

The doc for QueryCancelledException also mentions that the checked exception is thrown if a query was cancelled or timed out while executing, https://ignite.apache.org/releases/2.4.0/javadoc/org/apache/ignite/cache/query/QueryCancelledException.html

The same is mentioned here as a way to cancel/timeout long running queries, https://apacheignite-sql.readme.io/v2.4/docs/query-cancellation

But strangely, the Javadoc for the IgniteCache.query(..) methods, https://ignite.apache.org/releases/2.4.0/javadoc/org/apache/ignite/IgniteCache.html#query-org.apache.ignite.cache.query.Query-, does not declare this checked exception, or for that matter any checked exception, as being thrown. The same is true of the QueryCursor.getAll() method. This makes it unclear where and how to handle query timeouts.

I wrote the code below, but I am unable to make the query time out, so I cannot quickly test that code path and see whether it is correct. I am hoping the exception will be thrown both by the IgniteCache.query(..) method and by QueryCursor.getAll() and its related methods.

During initial testing I realized that the minimum timeout granularity for SqlQuery.setTimeout(int timeout, TimeUnit timeUnit) appears to be TimeUnit.MILLISECONDS, which makes it harder to force a timeout for testing.

Does the code below look right? I want to avoid the cursor methods and rely on IgniteCache.query(..), called inside the try-with-resources, to detect the timeout. Will this work?

@Scheduled(fixedDelayString = "${checkInterval}", initialDelayString = "${checkDelay}")
private final void monitorHealth() {
    if(!isReady) {
        return;
    }
    try (QueryCursor<Entry<Integer, FabricInfo>> cursor = fabricInfoCache.query(SQL_QUERY)) {
        cursor.iterator();
        // Reset the query time out counter..
        if(retryCount != 0) {
            retryCount = 0;
            LOGGER.warn("Client health check query executed without getting timed out before the configured maximum number of timeout retries was reached. Resetting retryCount to zero.");
        }
    } catch (Exception e) {
        if(e.getCause() instanceof QueryCancelledException) {
            retryCount++;
            LOGGER.warn("Client health check query timed out for the {} time.", retryCount);

            if(retryCount > QUERY_MAX_RETRIES_VALUE) {
                // Query timed out the maximum number of times..
                LOGGER.error("Client health check query timed out repeatedly for the maximum number of times configured : {}. Initiating a disconnect-reconnect.", retryCount);
                reconnectAction();
            }
        } else {
            if (e.getCause() instanceof IgniteClientDisconnectedException) {
                LOGGER.error("Client health check query failed due to client node getting disconnected from cluster. Initiating a disconnect-reconnect.", e.getCause());
            } else {
                // Treat other failures like CacheStoppedException, etc same as IgniteClientDisconnectedException...
                LOGGER.error("Client health check query failed. Initiating a disconnect-reconnect.", e.getCause());
            }
            reconnectAction();
        }
    }
}

Thanks, Muthu

lmk
  • Updated to correct the code by adding "cursor.iterator();" inside of the try-with-resources block to make sure the exception would be thrown when the query times out as @Denis mentioned. – lmk Jul 16 '18 at 18:05

1 Answer

QueryCancelledException is thrown from methods of QueryCursor, wrapped into IgniteException, which is a subclass of RuntimeException.

The query is not executed as soon as you call the IgniteCache#query(...) method. Execution only happens when the QueryCursor#iterator() method is called.

For an example, see the following test in the Ignite project, which checks that query cancellation and timeouts are respected: IgniteCacheLocalQueryCancelOrTimeoutSelfTest.
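
Since the exception arrives wrapped, a handler that checks only e.getCause() would miss it whenever it sits more than one level down. Below is a minimal sketch of walking the whole cause chain instead. It uses a stand-in exception class (StandInQueryCancelledException, hypothetical) so that it runs without Ignite on the classpath; in real code the type to look for would be QueryCancelledException. How deep Ignite actually nests the exception is not stated here, so treat the three-level nesting in the demo as an assumption for illustration only.

```java
final class CauseChain {
    // Hypothetical stand-in for Ignite's QueryCancelledException, used only
    // so this sketch compiles and runs without the Ignite dependency.
    static final class StandInQueryCancelledException extends Exception {
        StandInQueryCancelledException(String msg) {
            super(msg);
        }
    }

    // Returns true if t itself, or any exception in its cause chain,
    // is an instance of the given type. The depth limit guards against
    // a cyclic cause chain.
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            if (type.isInstance(cur)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed nesting for the demo: two runtime wrappers around the timeout.
        Throwable wrapped = new RuntimeException("outer",
                new RuntimeException("middle",
                        new StandInQueryCancelledException("timed out")));

        // Checking only the direct cause misses the nested exception.
        System.out.println(wrapped.getCause() instanceof StandInQueryCancelledException); // false

        // Walking the chain finds it.
        System.out.println(hasCause(wrapped, StandInQueryCancelledException.class)); // true
    }
}
```

In a catch block like the one in the question, the condition would then read hasCause(e, QueryCancelledException.class) rather than testing e.getCause() directly.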

Denis
  • Thanks @Denis, I suspected this and you have clarified it. I had tried calling QueryCursor#getAll() inside the try-with-resources block as well, but couldn't get the query to time out. I will try calling QueryCursor#iterator() and update. Thanks for the clarification and the links to the test code. – lmk Jul 16 '18 at 09:55
  • I still could not get the timeouts to happen to test that code path, but I am proceeding to check it in based on your confirmation. I have edited the code above to add the QueryCursor#iterator() call inside the try-with-resources block. Hope the code is correct. – lmk Jul 16 '18 at 18:01