Unable to process sample word count as Spark job

Question

I have spark-master and spark-worker running in an SAP Kyma environment (a flavor of Kubernetes), along with JupyterLab, all with ample CPU and RAM allocated.

I can access the Spark Master UI and see that the workers are registered (screenshot below).

I am using Python 3 to submit the job (snippet below):

import pyspark

# Point the driver at the standalone master service
conf = pyspark.SparkConf()
conf.setMaster('spark://spark-master:7077')
sc = pyspark.SparkContext(conf=conf)
sc  # display the context in the notebook

I can see the Spark context as the output of sc. After this, I prepare the data to submit to spark-master (snippet below):

words = 'the quick brown fox jumps over the lazy dog the quick brown fox jumps over the lazy dog'
seq = words.split()
# Distribute the words across the cluster
data = sc.parallelize(seq)
# Pair each word with 1, then sum the counts per word
counts = data.map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b).collect()
dict(counts)
sc.stop()

However, the notebook starts logging warning messages (snippet below), and the job runs indefinitely until I kill it from the Spark Master UI.

22/01/27 19:42:39 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/01/27 19:42:54 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
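
The warning asks whether the registered workers can satisfy what the application requests. As a minimal sketch of one thing to rule out, the settings below cap the request using the standard Spark properties spark.executor.memory and spark.cores.max, and optionally set spark.driver.host so executors know where to reach a driver running in another pod. The helper name build_spark_settings and every value shown are illustrative assumptions, not something taken from this cluster, and nothing here is confirmed as the cause in this case.

```python
def build_spark_settings(master_url, executor_memory="512m", max_cores=1, driver_host=None):
    """Return (key, value) pairs that cap what the application requests.

    Hypothetical helper: the defaults are placeholders and should be set
    below what the Spark Master UI reports as free on the workers.
    """
    settings = [
        ("spark.master", master_url),
        ("spark.executor.memory", executor_memory),  # memory per executor
        ("spark.cores.max", str(max_cores)),         # total cores for this app
    ]
    if driver_host is not None:
        # Address that executors use to connect back to the driver
        settings.append(("spark.driver.host", driver_host))
    return settings


settings = build_spark_settings("spark://spark-master:7077")
for key, value in settings:
    print(f"{key}={value}")
```

The resulting pairs can be applied with pyspark.SparkConf().setAll(settings) before creating the SparkContext. If the warning persists with a request this small, resources are probably not the problem.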

I am new to Kyma (Kubernetes) and Spark. Any help would be much appreciated.

Thanks

The code is running on JupyterLab, right? Is it also running in a pod? — sai, Jan 30 '22 at 17:19
Yes, the code is running on JupyterLab, which is also deployed as a pod in the same namespace. I can see the job submitted from JupyterLab in the Spark UI, but the Spark workers cannot process it, even though they are registered with the master. — Surya, Feb 01 '22 at 04:54

score 0 · Accepted Answer · answered Oct 29 '22 at 16:30

For those who stumble upon the same question:

Check your infrastructure certificate. It turned out that Kubernetes was issuing the wrong internal certificate, which the pods did not recognise.

After we fixed the certificate, everything started working.
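
One way to check a certificate from inside a pod is to attempt a TLS handshake and see whether verification succeeds. The sketch below uses only the Python standard library. The answer does not say which endpoint or certificate was affected, so the function name probe_tls, the host, the port, and the CA file you pass in are all assumptions to adapt to your cluster.

```python
import socket
import ssl


def probe_tls(host, port, cafile=None, timeout=5.0):
    """Attempt a verified TLS handshake and return a short status string.

    Hypothetical helper: cafile is an optional CA bundle to trust instead
    of the system default store.
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return f"ok: {tls.version()}"
    except ssl.SSLCertVerificationError as exc:
        # The server presented a certificate this pod does not trust
        return f"cert_error: {exc.verify_message}"
    except ssl.SSLError as exc:
        # Handshake failed for a reason other than verification
        return f"tls_error: {exc}"
    except OSError as exc:
        # DNS failure, refused connection, or timeout
        return f"unreachable: {exc}"
```

For example, from a notebook cell you could call probe_tls("kubernetes.default.svc", 443, cafile="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"), assuming the pod has the usual service account mount. A result starting with cert_error points at a trust problem rather than a Spark resource problem.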

Surya