I have a GCP Dataproc Spark cluster, and I'm running a Spark Structured Streaming program which reads from Kafka, processes the data, and finally writes it to multiple sinks (Kafka, MongoDB).
The Dataproc cluster has 1 master and 3 worker nodes (n1-highmem-16).
I'm getting a "No available nodes reported" error, shown below:
22/08/30 01:56:48 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:56:51 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:56:54 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:56:57 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:57:00 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:57:03 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:57:07 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:57:10 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
22/08/30 01:57:13 WARN org.apache.spark.deploy.yarn.YarnAllocatorNodeHealthTracker: No available nodes reported, please check Resource Manager.
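To see why YARN reports no available nodes, the ResourceManager's view of the nodes can be checked from the master node (standard YARN CLI; <node-id> below is a placeholder for one of the node IDs printed by the first command):
yarn node -list -all          # lists nodes in every state (RUNNING, UNHEALTHY, LOST, ...)
yarn node -status <node-id>   # prints the node report, including the Health-Report field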
Here is the command used to create the Dataproc cluster:
TYPE=n1-highmem-16
CNAME=versa-structured-stream-prom
BUCKET=dataproc-spark-logs
REGION=us-east1
ZONE=us-east1-b
IMG_VERSION=2.0-ubuntu18
PROJECT=versa-sml-googl
NUM_WORKER=3
gcloud compute routers nats create dataproc-nat-spark-kafka --nat-all-subnet-ip-ranges --router=dataproc-router --auto-allocate-nat-external-ips --region=us-east1
# in versa-sml-googl
gcloud beta dataproc clusters create $CNAME \
--enable-component-gateway \
--bucket $BUCKET \
--region $REGION \
--zone $ZONE \
--no-address --master-machine-type $TYPE \
--master-boot-disk-size 500 \
--master-boot-disk-type pd-ssd \
--num-workers $NUM_WORKER \
--worker-machine-type $TYPE \
--worker-boot-disk-type pd-ssd \
--worker-boot-disk-size 1000 \
--image-version $IMG_VERSION \
--scopes 'https://www.googleapis.com/auth/cloud-platform' \
--project $PROJECT \
--initialization-actions 'gs://dataproc-spark-configs/pip_install.sh','gs://dataproc-spark-configs/connectors-feb1.sh','gs://dataproc-spark-configs/prometheus.sh' \
--metadata 'gcs-connector-version=2.0.0' \
--metadata 'bigquery-connector-version=1.2.0' \
--properties 'dataproc:dataproc.logging.stackdriver.job.driver.enable=true,dataproc:job.history.to-gcs.enabled=true,spark:spark.dynamicAllocation.enabled=true,spark:spark.eventLog.dir=gs://dataproc-spark-logs/joblogs,spark:spark.history.fs.logDirectory=gs://dataproc-spark-logs/joblogs'
Command used to launch the Structured Streaming job:
gcloud dataproc jobs submit pyspark main.py \
--cluster $CLUSTER \
--properties ^#^spark.jars.packages=org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2,org.mongodb.spark:mongo-spark-connector_2.12:3.0.2#spark.dynamicAllocation.enabled=true#spark.shuffle.service.enabled=true#spark.sql.autoBroadcastJoinThreshold=150m#spark.ui.prometheus.enabled=true#spark.kubernetes.driver.annotation.prometheus.io/scrape=true#spark.kubernetes.driver.annotation.prometheus.io/path=/metrics/executors/prometheus/#spark.kubernetes.driver.annotation.prometheus.io/port=4040#spark.app.name=structuredstreaming-versa \
--jars=gs://dataproc-spark-jars/spark-avro_2.12-3.1.3.jar,gs://dataproc-spark-jars/isolation-forest_2.4.3_2.12-2.0.8.jar,gs://dataproc-spark-jars/spark-bigquery-with-dependencies_2.12-0.23.2.jar,gs://dataproc-spark-jars/mongo-spark-connector_2.12-3.0.2.jar,gs://dataproc-spark-jars/bson-4.0.5.jar,gs://dataproc-spark-jars/mongodb-driver-sync-4.0.5.jar,gs://dataproc-spark-jars/mongodb-driver-core-4.0.5.jar \
--files=gs://kafka-certs/versa-kafka-gke-ca.p12,gs://kafka-certs/syslog-vani-noacl.p12,gs://kafka-certs/alarm-compression-user.p12,gs://kafka-certs/appstats-user.p12,gs://kafka-certs/insights-user.p12,gs://kafka-certs/intfutil-user.p12,gs://kafka-certs/reloadpred-chkpoint-user.p12,gs://kafka-certs/reloadpred-user.p12,gs://dataproc-spark-configs/metrics.properties,gs://dataproc-spark-configs/params.cfg,gs://kafka-certs/appstat-anomaly-user.p12,gs://kafka-certs/appstat-agg-user.p12,gs://kafka-certs/alarmblock-user.p12 \
--region us-east1 \
--py-files streams.zip,utils.zip
Status of the disk space on the master node:
karanalang@versa-structured-stream-prom-m:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 52G 0 52G 0% /dev
tmpfs 11G 1.2M 11G 1% /run
/dev/sda1 485G 14G 472G 3% /
tmpfs 52G 0 52G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 52G 0 52G 0% /sys/fs/cgroup
/dev/loop0 303M 303M 0 100% /snap/google-cloud-cli/56
/dev/loop1 47M 47M 0 100% /snap/snapd/16292
/dev/loop2 56M 56M 0 100% /snap/core18/2538
/dev/sda15 105M 4.4M 100M 5% /boot/efi
tmpfs 11G 0 11G 0% /run/user/113
tmpfs 11G 0 11G 0% /run/user/114
tmpfs 11G 0 11G 0% /run/user/116
tmpfs 11G 0 11G 0% /run/user/112
tmpfs 11G 0 11G 0% /run/user/117
/dev/loop3 304M 304M 0 100% /snap/google-cloud-cli/62
tmpfs 11G 0 11G 0% /run/user/1008
/dev/loop0, /dev/loop1, etc. are showing 100% use. What kind of data is stored on these?
I'm trying to understand what is causing the issue and how to fix it. Any ideas on this? tia!
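To check what the loop devices actually hold (a quick check, assuming the standard Ubuntu snap layout on this image):
mount | grep '^/dev/loop'   # each loop device should show up as a read-only squashfs image backing a snap under /snap/...
sudo losetup -a             # shows the file each loop device is mapped to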
Please note: in terms of volume, ~4.5M messages are processed every 10 minutes (syslog messages are received every 10 minutes), and it takes ~2.5-3 minutes to process the data.
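For a rough sense of the throughput this implies (simple arithmetic, assuming an even arrival rate over the 10-minute window):
echo $((4500000 / 600))   # ~7500 messages/sec average ingest rate
echo $((4500000 / 180))   # ~25000 messages/sec effective rate during the ~3-minute processing window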
UPDATE: I just ran this command:
karanalang@versa-structured-stream-prom-m:~$ hdfs dfsadmin -report -live -decommissioning
Configured Capacity: 3121327865856 (2.84 TB)
Present Capacity: 306192543744 (285.16 GB)
DFS Remaining: 306034360320 (285.02 GB)
DFS Used: 158183424 (150.86 MB)
DFS Used%: 0.05%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 10.142.0.23:9866 (versa-structured-stream-prom-w-0.c.versa-sml-googl.internal)
Hostname: versa-structured-stream-prom-w-0.c.versa-sml-googl.internal
Decommission Status : Normal
Configured Capacity: 1040442621952 (968.99 GB)
DFS Used: 7831552 (7.47 MB)
Non DFS Used: 937229017088 (872.86 GB)
DFS Remaining: 103188996096 (96.10 GB)
DFS Used%: 0.00%
DFS Remaining%: 9.92%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Aug 30 05:36:06 UTC 2022
Last Block Report: Tue Aug 30 01:14:56 UTC 2022
Num of Blocks: 26
Name: 10.142.0.36:9866 (versa-structured-stream-prom-w-1.c.versa-sml-googl.internal)
Hostname: versa-structured-stream-prom-w-1.c.versa-sml-googl.internal
Decommission Status : Normal
Configured Capacity: 1040442621952 (968.99 GB)
DFS Used: 72830976 (69.46 MB)
Non DFS Used: 941126131712 (876.49 GB)
DFS Remaining: 99226882048 (92.41 GB)
DFS Used%: 0.01%
DFS Remaining%: 9.54%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Aug 30 05:36:07 UTC 2022
Last Block Report: Tue Aug 30 04:21:31 UTC 2022
Num of Blocks: 26
Name: 10.142.0.8:9866 (versa-structured-stream-prom-w-2.c.versa-sml-googl.internal)
Hostname: versa-structured-stream-prom-w-2.c.versa-sml-googl.internal
Decommission Status : Normal
Configured Capacity: 1040442621952 (968.99 GB)
DFS Used: 77520896 (73.93 MB)
Non DFS Used: 936729841664 (872.40 GB)
DFS Remaining: 103618482176 (96.50 GB)
DFS Used%: 0.01%
DFS Remaining%: 9.96%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Aug 30 05:36:07 UTC 2022
Last Block Report: Tue Aug 30 02:26:52 UTC 2022
Num of Blocks: 30
Decommissioning datanodes (0):
The worker nodes show -> Cache Used%: 100.00%, Cache Remaining%: 0.00%
Also, almost all of the ~1000 GB on each worker is used up (Non DFS Used is ~872 GB per node). Is this what is causing the issue? Why is the cache not getting freed up?
Also, when I run the following command:
curl "http://${HOSTNAME}:8088/ws/v1/cluster/nodes"
I see this error:
"healthReport":"1/1 local-dirs usable space is below configured utilization percentage/no more usable space [ /hadoop/yarn/nm-local-dir : used space above threshold of 90.0% ] ; 1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /var/log/hadoop-yarn/userlogs : used space above threshold of 90.0% ]
The question is: what is causing the issue, and how do I fix this?
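To confirm what is eating the space that the disk health checker is complaining about, the two directories named in the health report can be checked directly on a worker node (a quick diagnostic sketch):
sudo du -sh /hadoop/yarn/nm-local-dir /var/log/hadoop-yarn/userlogs
# drill into the per-application container logs to find the biggest offenders
sudo du -sh /var/log/hadoop-yarn/userlogs/application_*/container_* 2>/dev/null | sort -h | tail -n 10
If I understand correctly, the 90% figure in the health report is yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage, so the nodes should flip back to healthy once usage drops below that threshold.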
UPDATE: Errors I see in the YARN logs:
karanalang@versa-structured-stream-w-0:~$ yarn logs -applicationId application_1662411638799_0005 -log_files stderr --size -100
2022-09-05 22:28:02,123 INFO client.RMProxy: Connecting to ResourceManager at versa-structured-stream-m/10.142.0.103:8032
2022-09-05 22:28:02,543 INFO client.AHSProxy: Connecting to Application History server at versa-structured-stream-m/10.142.0.103:10200
Container: container_1662411638799_0005_01_000002 on versa-structured-stream-w-2.c.versa-sml-googl.internal:8026
LogAggregationType: LOCAL
================================================================================================================
LogType:stderr
LogLastModifiedTime:Mon Sep 05 22:28:03 +0000 2022
LogLength:68980134141
LogContents:
rruptibleThread. It may hang when KafkaDataConsumer's methods are interrupted because of KAFKA-1894
End of LogType:stderr.This log file belongs to a running container (container_1662411638799_0005_01_000002) and so may not be complete.
***********************************************************************
Container: container_1662411638799_0005_01_000003 on versa-structured-stream-w-1.c.versa-sml-googl.internal:8026
LogAggregationType: LOCAL
================================================================================================================
LogType:stderr
LogLastModifiedTime:Mon Sep 05 21:37:08 +0000 2022
LogLength:636166940
LogContents:
2/09/05 21:37:08 ERROR org.apache.spark.executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
End of LogType:stderr.
***********************************************************************
Container: container_1662411638799_0005_01_000001 on versa-structured-stream-w-0.c.versa-sml-googl.internal:8026
LogAggregationType: LOCAL
================================================================================================================
LogType:stderr
LogLastModifiedTime:Mon Sep 05 22:23:34 +0000 2022
LogLength:82194130445
LogContents:
cutor: Exception in task 2.0 in stage 425.0 (TID 1847): Python worker exited unexpectedly (crashed)
End of LogType:stderr.
***********************************************************************
karanalang@versa-structured-stream-w-0:~$ yarn logs -applicationId application_1662411638799_0005 -log_files stderr --size -300
2022-09-05 22:28:22,626 INFO client.RMProxy: Connecting to ResourceManager at versa-structured-stream-m/10.142.0.103:8032
2022-09-05 22:28:23,053 INFO client.AHSProxy: Connecting to Application History server at versa-structured-stream-m/10.142.0.103:10200
Container: container_1662411638799_0005_01_000002 on versa-structured-stream-w-2.c.versa-sml-googl.internal:8026
LogAggregationType: LOCAL
================================================================================================================
LogType:stderr
LogLastModifiedTime:Mon Sep 05 22:28:23 +0000 2022
LogLength:70328318158
LogContents:
ets=21143', ' mstatsTotRecvdOctets=22546', ' mstatsTotSessDuration=300000', '22/09/05 22:28:23 WARN org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer: KafkaDataConsumer is not running in UninterruptibleThread. It may hang when KafkaDataConsumer's methods are interrupted because of KAFKA-1894
End of LogType:stderr.This log file belongs to a running container (container_1662411638799_0005_01_000002) and so may not be complete.
***********************************************************************
Container: container_1662411638799_0005_01_000003 on versa-structured-stream-w-1.c.versa-sml-googl.internal:8026
LogAggregationType: LOCAL
================================================================================================================
LogType:stderr
LogLastModifiedTime:Mon Sep 05 21:37:08 +0000 2022
LogLength:636166940
LogContents:
c1da28e-594190416-executor-3, groupId=spark-kafka-source-a72e590c-84a9-4f54-850f-9c00fc1da28e-594190416-executor] Resetting offset for partition syslog.ueba-us4.v1.versa.demo3-2 to offset 340737332.
22/09/05 21:37:08 ERROR org.apache.spark.executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
End of LogType:stderr.
***********************************************************************
Container: container_1662411638799_0005_01_000001 on versa-structured-stream-w-0.c.versa-sml-googl.internal:8026
LogAggregationType: LOCAL
================================================================================================================
LogType:stderr
LogLastModifiedTime:Mon Sep 05 22:23:34 +0000 2022
LogLength:82194130445
LogContents:
scala:232)
22/09/05 22:23:34 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 425.0 (TID 1845): Broken pipe (Write failed)
22/09/05 22:23:34 ERROR org.apache.spark.executor.Executor: Exception in task 2.0 in stage 425.0 (TID 1847): Python worker exited unexpectedly (crashed)
End of LogType:stderr.
***********************************************************************
Per the note from @Dagang, /var/log/hadoop-yarn/userlogs/containerId/stderr is filling up fast. There are primarily a couple of errors I see in stderr; the first is:
java.lang.IllegalStateException: Buffer overflow when available data size (16384) >= application buffer size (16384)
at org.apache.kafka.common.network.SslTransportLayer.read(SslTransportLayer.java:597)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:95)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:447)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:397)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:678)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:580)
at org.apache.kafka.common.network.Selector.poll(Selector.java:485)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:544)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:227)
at org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1760)
at org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1718)
at org.apache.spark.sql.kafka010.consumer.InternalKafkaConsumer.getAvailableOffsetRange(KafkaDataConsumer.scala:109)
at org.apache.spark.sql.kafka010.consumer.InternalKafkaConsumer.fetch(KafkaDataConsumer.scala:83)
at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.fetchData(KafkaDataConsumer.scala:539)
at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.fetchRecord(KafkaDataConsumer.scala:463)
at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.$anonfun$get$1(KafkaDataConsumer.scala:310)
at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:77)
at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.runUninterruptiblyIfPossible(KafkaDataConsumer.scala:604)
at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.get(KafkaDataConsumer.scala:287)
at org.apache.spark.sql.kafka010.KafkaBatchPartitionReader.next(KafkaBatchPartitionReader.scala:63)
at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:79)
at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:112)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithoutKey_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:134)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:498)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:501)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
The second error (actually a warning), probably related to the PySpark UDFs?
22/09/06 00:34:17 WARN org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer: KafkaDataConsumer is not running in UninterruptibleThread. It may hang when KafkaDataConsumer's methods are interrupted because of KAFKA-1894
... (this exact warning repeats continuously, many times per second, filling the container stderr)
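Since this warning is what is flooding the container stderr, one option I'm considering (a sketch, assuming the default log4j 1.x setup that Spark 3.1 uses on the 2.0 image, and reusing my existing gs://dataproc-spark-configs bucket) is to ship a custom log4j.properties that raises that logger to ERROR and point the driver and executors at it:
cat > log4j.properties <<'EOF'
# minimal log4j 1.x config: keep the default console appender on stderr
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n
# silence the per-record KafkaDataConsumer warning that floods the container logs
log4j.logger.org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer=ERROR
EOF
gsutil cp log4j.properties gs://dataproc-spark-configs/log4j.properties
# then add to the job submit command:
#   --files ...,gs://dataproc-spark-configs/log4j.properties
#   --properties ...#spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties#spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties
This wouldn't fix the Kafka SSL buffer-overflow error itself, but it should at least stop the stderr files from growing to tens of GB and tripping YARN's 90% disk threshold.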