I am running a Spark job that writes to an Alluxio cluster with 20 workers (Alluxio 1.6.1). The job failed to write its output with alluxio.exception.status.DeadlineExceededException. According to the Alluxio WebUI, the worker is still alive. How can I avoid this failure?
alluxio.exception.status.DeadlineExceededException: Timeout writing to WorkerNetAddress{host=spark-74-44.xxxx, rpcPort=51998, dataPort=51999, webPort=51997, domainSocketPath=} for request type: ALLUXIO_BLOCK
id: 3209355843338240
tier: 0
worker_group {
host: "spark6-64-156.xxxx"
rpc_port: 51998
data_port: 51999
web_port: 51997
socket_path: ""
}