I have spark-master and spark-worker running in an SAP Kyma environment (a flavor of Kubernetes), along with JupyterLab, all with ample CPU and RAM allocated.

I can access the Spark Master UI and see that the workers are registered as well (screenshot: Spark Master UI).

I am using Python 3 to submit the job (snippet below):

import pyspark

conf = pyspark.SparkConf()
conf.setMaster('spark://spark-master:7077')
sc = pyspark.SparkContext(conf=conf)
sc

Evaluating sc displays the Spark context as expected. After this, I prepare the data and submit the job to the spark-master (snippet below):

words = 'the quick brown fox jumps over the lazy dog the quick brown fox jumps over the lazy dog'
seq = words.split()
data = sc.parallelize(seq)
counts = data.map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b).collect()
dict(counts)
sc.stop()

However, the notebook then starts logging warning messages (snippet below), and this continues indefinitely until I kill the application from the Spark Master UI.

22/01/27 19:42:39 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/01/27 19:42:54 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
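
For context, this warning generally means the application registered with the master but no executor became usable for it. Two commonly cited causes are a resource request that no worker can satisfy, and executors that cannot connect back to the driver (here, the notebook pod). The sketch below only illustrates making both explicit; it is not the fix that resolved this question. The helper names, the port, and the memory and core values are assumptions for illustration, while the configuration keys are standard Spark properties.

```python
def driver_settings(driver_host, driver_port=40000, executor_memory="512m", max_cores=1):
    """Return Spark properties as (key, value) pairs.

    All default values are illustrative assumptions, not values from the question.
    """
    return [
        ("spark.master", "spark://spark-master:7077"),
        # Address the executors use to call back to the driver (the notebook pod).
        ("spark.driver.host", driver_host),
        # Listen on all interfaces inside the pod.
        ("spark.driver.bindAddress", "0.0.0.0"),
        # Pin the driver port so it is known in advance instead of random.
        ("spark.driver.port", str(driver_port)),
        # Keep the request small enough for a registered worker to satisfy.
        ("spark.executor.memory", executor_memory),
        ("spark.cores.max", str(max_cores)),
    ]


def build_context(driver_host):
    """Create a SparkContext from the settings above (requires pyspark)."""
    import pyspark  # imported lazily so driver_settings works without pyspark

    conf = pyspark.SparkConf().setAll(driver_settings(driver_host))
    return pyspark.SparkContext(conf=conf)
```

In a notebook pod, driver_host would typically be the pod's own IP or a service name that the worker pods can resolve.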

I am new to Kyma (Kubernetes) and Spark. Any help would be much appreciated.

Thanks

Surya
  • The code is running on jupyterlab right? Is it also running in a pod? – sai Jan 30 '22 at 17:19
  • Yes, the code is running on JupyterLab, which is also deployed as a pod in the same namespace. I can see the job submitted via JupyterLab in the Spark UI, but the spark-workers cannot process it, even though the workers are registered with the master. – Surya Feb 01 '22 at 04:54
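
Since the job shows up in the Spark UI but the workers never run it, one thing worth ruling out is plain network reachability between the pods. The following is a minimal sketch using only the standard library; the function name is hypothetical and the host and port values in the comments are assumptions based on the setup described above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# From the notebook pod, towards the master:
#   can_connect("spark-master", 7077)
# From a worker pod, towards the driver (host and port as shown for the
# running application, or as pinned via spark.driver.host / spark.driver.port):
#   can_connect("<driver-host>", <driver-port>)
```

A False result from a worker pod towards the driver would point at networking rather than at CPU or memory allocation.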

1 Answer

For anyone who stumbles upon the same problem:

Check your infrastructure certificate. It turned out that Kubernetes was issuing the wrong internal certificate, which was not recognised by the pods.

After we fixed the certificate, everything started working.
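
As a rough way to check whether a pod trusts the certificate presented by an internal endpoint, a verified TLS handshake can be attempted from inside the pod. This is only a sketch: the function name is hypothetical, and which endpoint was affected in our case is not something this snippet encodes. The example in the comments uses the in-cluster Kubernetes API address and the service account CA path that pods normally have mounted.

```python
import socket
import ssl


def check_tls(host, port, cafile=None, timeout=5.0):
    """Attempt a verified TLS handshake and return (ok, detail)."""
    try:
        # With cafile=None the system trust store is used.
        context = ssl.create_default_context(cafile=cafile)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, str(tls.version())
    except ssl.SSLCertVerificationError as exc:
        return False, "certificate not trusted: " + str(exc)
    except (ssl.SSLError, OSError) as exc:
        return False, "connection or handshake failed: " + str(exc)


# Example, run from inside a pod (endpoint chosen for illustration only):
#   check_tls(
#       "kubernetes.default.svc",
#       443,
#       cafile="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
#   )
```

A "certificate not trusted" result means the pod's CA bundle does not validate the certificate the endpoint presents.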

Surya