I have deployed an Amazon EC2 cluster with Spark like so:
~/spark-ec2 -k spark -i ~/.ssh/spark.pem -s 2 --region=eu-west-1 --spark-version=1.3.1 launch spark-cluster
I copy a file I need first to the master, and then from the master to HDFS using:
ephemeral-hdfs/bin/hadoop fs -put ~/ANTICOR_2_10000.txt ~/user/root/ANTICOR_2_10000.txt
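For completeness, the first hop to the master is a plain scp; it looks roughly like this, with the master's public DNS as a placeholder:

scp -i ~/.ssh/spark.pem ~/ANTICOR_2_10000.txt root@master_public_dns:~/ANTICOR_2_10000.txt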
I have a jar I want to run which was compiled with JDK 8 (I use a lot of Java 8 features), so I copy it over with scp
and run it with:
spark/bin/spark-submit --master spark://public_dns_with_port --class package.name.to.Main job.jar -f hdfs://public_dns:~/ANTICOR_2_10000.txt
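The jar gets to the master the same way; roughly, with the same placeholder DNS:

scp -i ~/.ssh/spark.pem job.jar root@master_public_dns:~/job.jar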
The problem is that spark-ec2 sets up the cluster with JDK 7, so I am getting the Unsupported major.minor version 52.0 error.
My question is: what are all the places where I need to change JDK 7 to JDK 8?
The steps I have taken so far on the master are (exact commands sketched after the list):
- Install JDK 8 with yum
- Run sudo alternatives --config java and select Java 8 as the preferred Java
- Set JAVA_HOME: export JAVA_HOME=/usr/lib/jvm/openjdk-8
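Put together, what I run on the master is roughly the following (the exact yum package name is my assumption; check what the AMI actually offers):

sudo yum install -y java-1.8.0-openjdk-devel   # OpenJDK 8; package name may differ on the AMI
sudo alternatives --config java                # pick the JDK 8 entry interactively
export JAVA_HOME=/usr/lib/jvm/openjdk-8        # path from my install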
Do I have to do that on all the nodes? Also, do I need to change the Java path that Hadoop uses in ephemeral-hdfs/conf/hadoop-env.sh,
or are there any other spots I have missed?
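To be concrete about what I mean: I assume the hadoop-env.sh change would just be repointing the JAVA_HOME line it already contains, e.g.:

# in ephemeral-hdfs/conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/openjdk-8

And if every node needs it, I believe spark-ec2 leaves a copy-dir script on the master that rsyncs a directory to all slaves, so something like this might propagate the config (treat this as a guess on my part):

~/spark-ec2/copy-dir ephemeral-hdfs/conf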