I have spark-master and spark-worker running in an SAP Kyma environment (a different flavor of Kubernetes), along with Jupyter Lab, all with ample CPU and RAM allocated.
I can access the Spark Master UI and see that the workers are registered as well (screenshot below).
I am using Python 3 to submit the job (snippet below):
import pyspark
conf = pyspark.SparkConf()
conf.setMaster('spark://spark-master:7077')
sc = pyspark.SparkContext(conf=conf)
sc
I can see the Spark context as the output of sc. After this, I prepare the data to submit to the spark-master (snippet below):
words = 'the quick brown fox jumps over the lazy dog the quick brown fox jumps over the lazy dog'
seq = words.split()
data = sc.parallelize(seq)
counts = data.map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b).collect()
dict(counts)
sc.stop()
However, the notebook then starts logging warning messages (snippet below), and this goes on forever until I kill the process from the Spark Master UI.
22/01/27 19:42:39 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/01/27 19:42:54 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
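This warning commonly appears when the executors cannot connect back to the driver (here, the notebook pod) or when the job asks for more cores or memory than the workers offer. Whether either applies to this cluster is not confirmed. The sketch below is a minimal, hedged example of settings one could try. The helper name driver_settings, the default values, and the port number are hypothetical. The configuration keys themselves are standard Spark properties.

```python
import socket


def driver_settings(driver_host=None, executor_memory="512m",
                    max_cores=1, driver_port=29413):
    """Build (key, value) pairs suitable for pyspark.SparkConf().setAll().

    All defaults are illustrative assumptions, not values taken from the cluster.
    """
    if driver_host is None:
        # Assumption: the notebook pod's own IP is reachable from the worker pods.
        driver_host = socket.gethostbyname(socket.gethostname())
    return [
        # Address the executors use to connect back to the driver.
        ("spark.driver.host", driver_host),
        # Listen on all interfaces inside the notebook pod.
        ("spark.driver.bindAddress", "0.0.0.0"),
        # Fixed port so it can be exposed through a Kubernetes Service if needed.
        ("spark.driver.port", str(driver_port)),
        # Keep the request small enough for a worker to accept it.
        ("spark.executor.memory", executor_memory),
        ("spark.cores.max", str(max_cores)),
    ]


# Usage with the snippet from the question (requires pyspark):
#   conf = pyspark.SparkConf()
#   conf.setMaster('spark://spark-master:7077')
#   conf.setAll(driver_settings())
#   sc = pyspark.SparkContext(conf=conf)
```

If the job still hangs, comparing the requested memory and cores against what each worker shows in the Spark Master UI would help narrow down which of the two causes is in play.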
I am new to Kyma (Kubernetes) and Spark. Any help would be much appreciated.
Thanks