I have deployed an Amazon EC2 cluster with Spark like so:
~/spark-ec2 -k spark -i ~/.ssh/spark.pem -s 2 --region=eu-west-1 --spark-version=1.3.1 launch spark-cluster
I copy a file I need first to the master, and then from the master to HDFS using:
ephemeral-hdfs/bin/hadoop fs -put ~/ANTICOR_2_10000.txt ~/user/root/ANTICOR_2_10000.txt
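For completeness, the first hop to the master is a plain scp; it looks roughly like this, with the master's public DNS as a placeholder:

scp -i ~/.ssh/spark.pem ~/ANTICOR_2_10000.txt root@master_public_dns:~/ANTICOR_2_10000.txt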
I have a jar I want to run which was compiled with JDK 8 (I use a lot of Java 8 features), so I copy it over with scp
and run it with:
spark/bin/spark-submit --master spark://public_dns_with_port --class package.name.to.Main job.jar -f hdfs://public_dns:~/ANTICOR_2_10000.txt
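The jar gets to the master the same way; roughly, with the same placeholder DNS:

scp -i ~/.ssh/spark.pem job.jar root@master_public_dns:~/job.jar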
The problem is that spark-ec2 sets up the cluster with JDK 7, so I am getting the Unsupported major.minor version 52.0 error.
My question is: what are all the places where I need to change JDK 7 to JDK 8?
The steps I have taken so far on the master are (exact commands sketched after the list):
- Install JDK 8 with yum
- Run sudo alternatives --config java and select Java 8 as the preferred Java
- Set JAVA_HOME: export JAVA_HOME=/usr/lib/jvm/openjdk-8
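Put together, what I run on the master is roughly the following (the exact yum package name is my assumption; check what the AMI actually offers):

sudo yum install -y java-1.8.0-openjdk-devel   # OpenJDK 8; package name may differ on the AMI
sudo alternatives --config java                # pick the JDK 8 entry interactively
export JAVA_HOME=/usr/lib/jvm/openjdk-8        # path from my install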
Do I have to do that on all the nodes? Also, do I need to change the Java path that Hadoop uses in ephemeral-hdfs/conf/hadoop-env.sh,
or are there any other spots I have missed?
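To be concrete about what I mean: I assume the hadoop-env.sh change would just be repointing the JAVA_HOME line it already contains, e.g.:

# in ephemeral-hdfs/conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/openjdk-8

And if every node needs it, I believe spark-ec2 leaves a copy-dir script on the master that rsyncs a directory to all slaves, so something like this might propagate the config (treat this as a guess on my part):

~/spark-ec2/copy-dir ephemeral-hdfs/conf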