
I have a Kubernetes cluster composed of only one VM (minikube cluster).

On this cluster, I have a Spark master and two workers running. I have set up the Ingress addon as follows (my Spark components use the default ports):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: minikube-ingress
  annotations:
spec:
  rules:
  - host: spark-kubernetes
    http:
      paths:
      - path: /web-ui
        backend:
          serviceName: spark-master
          servicePort: 8080
      - path: /
        backend:
          serviceName: spark-master
          servicePort: 7077

And I added my k8s IP in my /etc/hosts

[MINIKUBE_IP] spark-kubernetes

I am able to connect to the master web UI through http://spark-kubernetes/web-ui.
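For reference, the Ingress rule can be sanity-checked from the command line (a sketch; it assumes minikube is running and sends the Host header explicitly in case /etc/hosts is not set up yet):

```shell
# A 200 response with the Spark master page confirms the /web-ui rule works.
curl -H "Host: spark-kubernetes" "http://$(minikube ip)/web-ui"
```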

I now want to submit a JAR stored on my local machine (the spark-examples JAR, for example). I expected this command to work:

./bin/spark-submit \
    --master spark://spark-kubernetes \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
     ./examples/jars/spark-examples_2.11-2.4.0.jar

But I get the following error:

2019-04-04 08:52:36 WARN  SparkSubmit$$anon$2:87 - Failed to load .
java.lang.ClassNotFoundException: 
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:810)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

What have I done wrong?

Notes:

  • I know that with Spark 2.4 I can have a cluster without a master and submit directly to Kubernetes, but I want to do it with a master for now
  • I use Spark 2.4
  • I use Kubernetes 1.14
  • https://stackoverflow.com/questions/17408769/how-do-i-resolve-classnotfoundexception – Ijaz Ahmad Apr 04 '19 at 22:32
  • I started a local spark cluster without k8s, and executed the same Spark Submit with the master option set to `--master spark://127.0.1.1:7077 \` . In this case everything works fine, so it does not look like a classpath problem. (Or I am missing something ?) – Nakeuh Apr 05 '19 at 06:58
  • hmm, the spark environment/installation/docker image is the same? – Ijaz Ahmad Apr 05 '19 at 08:35

1 Answer


To make it work, either use client mode, which distributes the JARs from your local machine (--deploy-mode client), or specify the path to the JAR file inside the container image. So instead of using

./examples/jars/spark-examples_2.11-2.4.0.jar, use something like /opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar (depending on the image you use)
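A sketch of both options, based on the command from the question (the :7077 port and the in-image path are assumptions; adjust them to your setup and image):

```shell
# Option 1: client mode - the driver runs locally, reads the JAR from the
# local filesystem, and distributes it to the executors itself.
./bin/spark-submit \
    --master spark://spark-kubernetes:7077 \
    --deploy-mode client \
    --class org.apache.spark.examples.SparkPi \
    ./examples/jars/spark-examples_2.11-2.4.0.jar

# Option 2: cluster mode - the driver runs inside the cluster, so the JAR
# path must exist inside the container image (this path is an assumption;
# check where your image ships the examples JAR).
./bin/spark-submit \
    --master spark://spark-kubernetes:7077 \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    /opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
```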

Also check my Spark operator for Kubernetes: https://github.com/radanalyticsio/spark-operator :)

Jiri Kremser
  • Great! I was so focused on the K8s part that I forgot that the problem might come from Spark :) I still have a little problem though. When I submit with `--master spark://spark-kubernetes`, I get an `Invalid master URL` error. The submit seems to only accept a URL with a port at the end. How can I get around this problem? – Nakeuh Apr 08 '19 at 07:19
  • 1
    check the gif in that `spark-operator` repo. There I am creating NodePort service to expose the spark master, hopefully it will work for you too. – Jiri Kremser Apr 08 '19 at 12:09
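The NodePort approach mentioned in the comment could look roughly like this (a sketch; the service name, label selector, and node port are assumptions and must match your spark-master deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: spark-master-nodeport
spec:
  type: NodePort
  selector:
    app: spark-master        # must match the labels on your master pod
  ports:
  - name: rpc
    port: 7077
    targetPort: 7077
    nodePort: 30077          # then submit with --master spark://$(minikube ip):30077
```

Unlike the Ingress (which only routes HTTP), a NodePort service exposes the raw TCP port that Spark's RPC protocol needs, which is why the 7077 rule in the Ingress above does not work for spark-submit.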