Note: I am running this code on a cluster with 16 slaves, HPCC version 6.4.40.

I am running some ECL code that returns this error:

System error: 0: Graph graph2[14], SLAVE #1 [10.313.316.31:20100]: Error receiving actinit data for graph: 14

What exactly does this error indicate?

My eclagent.log

Is it maybe running out of memory?

In the Thor master log, just before the exception, I can see two log lines: the first starts with NIC (network interface?) and the other with SYST (system?). The values don't seem to change drastically.

Oscar Foley

1 Answer

From the development team:

Why you are seeing that error:

There are a lot of logical files at the same scope level, which significantly slows down access (lookup). If hundreds or thousands of files are being looked up for a single read, the total lookup time ends up exceeding the timeout.

Scopes with a lot of logical files at the same level like this used to be a pain point for Dali and for clients accessing files within them. Basically, each lookup performed a linear search through the scope for a match. NB: that was fixed some years ago (in 7.12.0).

So my guess is that the number of files in the scopes being accessed by this query (that haven't been rolled up?) has grown, and the cumulative time to look them up now exceeds the [25 minute] timeout.
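To make the scaling concrete, here is a rough Python sketch of the cost model described above. It is not Dali's actual implementation: the function names and the example sizes are hypothetical, and it only counts worst-case name comparisons, assuming each lookup scans every entry in the scope.

```python
# Toy cost model (an assumption for illustration, not Dali code):
# a linear scan checks up to every file in the scope for each lookup.

def linear_lookup_comparisons(scope_size: int, files_read: int) -> int:
    """Worst-case comparisons when every lookup scans the whole scope."""
    return scope_size * files_read


def indexed_lookup_probes(files_read: int) -> int:
    """One probe per lookup, as with a hash- or index-based scheme."""
    return files_read


if __name__ == "__main__":
    # Hypothetical sizes: a query that reads 10% of the files in its scope.
    for scope_size in (100, 1_000, 10_000):
        files_read = scope_size // 10
        linear = linear_lookup_comparisons(scope_size, files_read)
        indexed = indexed_lookup_probes(files_read)
        print(f"scope={scope_size:>6} reads={files_read:>5} "
              f"linear={linear:>10} indexed={indexed:>5}")
```

Under this model the work grows with the product of the scope size and the number of files read, which is why a query that used to finish in time can cross a fixed timeout once the scope has grown.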

We recommend that you roll up your files and/or upgrade your cluster as soon as possible. The current gold release is now up to version 9.

Hope this helps,

Bob

Bob Foreman