1

I have a cluster set up with CDH 5.9.0. The default Spark parcel that Cloudera ships is 1.6.0. I need to upgrade it to 1.6.3 because of a distributed-cache issue that was resolved in the following git commit: https://github.com/RicoGit/spark/commit/e5f1d9c8f9c94615322aaf7508e753307f553d53

What is the cleanest way to upgrade the Spark service deployed through Cloudera? And, as an extension of this, how would I upgrade the same cluster to Spark 2.0?

Thank you.

taransaini43
  • run `cat /etc/centos-release` and share the console output. – Abdennour TOUMI Dec 06 '16 at 11:01
  • the underlying OS is not CentOS but Ubuntu (trusty, 14.04) – taransaini43 Dec 06 '16 at 11:07
  • Possible duplicate of [How to upgrade Spark to newer version?](http://stackoverflow.com/questions/33887227/how-to-upgrade-spark-to-newer-version) – Abdennour TOUMI Dec 06 '16 at 13:26
  • Check this http://blog.cloudera.com/blog/2016/09/apache-spark-2-0-beta-now-available-for-cdh/ – BruceWayne Dec 07 '16 at 05:32
  • "Apache Spark 2.0 (Beta) can only be installed on CDH 5.7 or CDH 5.8 clusters, and it requires a minimum CM version of 5.8." I have a cluster set up with CDH 5.9.0, and many other services are installed on top of it, so I am not tweaking that. – taransaini43 Dec 07 '16 at 05:46
  • Did you find an answer for this? I have a similar question in case you can help me; I'm researching whether I can upgrade from Spark 1.1 to Spark 1.6 on CDH 5.2 – Luis Leal Apr 05 '17 at 23:11

2 Answers

2

Cloudera recently released Spark 2.0 parcels, which you can download from the Spark parcel archive.

Follow the link for the installation procedure

Note: Apache Spark 2.0 can only be installed on CDH 5.7, CDH 5.8, or CDH 5.9 clusters, and it requires a minimum Cloudera Manager version of 5.8.3 (or 5.9 and higher).
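For reference, the procedure behind that link amounts to registering Cloudera's Spark 2 CSD and parcel repository in Cloudera Manager. A sketch of the relevant settings follows; the archive URLs reflect Cloudera's documented layout, but confirm the exact paths and jar version against your CM release:

```
# On the Cloudera Manager host: place the SPARK2 CSD jar where CM
# discovers custom service descriptors, then restart cloudera-scm-server:
#   /opt/cloudera/csd/SPARK2_ON_YARN-<version>.jar   (owner: cloudera-scm)
#
# In the Cloudera Manager UI: Administration -> Settings ->
#   "Remote Parcel Repository URLs", add:
#   https://archive.cloudera.com/spark2/parcels/latest/
#
# Then on the Parcels page: Download, Distribute, and Activate the SPARK2
# parcel, add the Spark 2 service, and restart any stale services.
```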

BruceWayne
0

Just follow the steps here:

https://gist.github.com/shredder47/ce2f158a2a3907c0d264c5e9e4aab2fa

Or

java -version                        # check the currently installed JDK
sudo yum remove java                 # yum assumes RHEL/CentOS; on Ubuntu use apt-get
sudo yum install java-1.8.0-openjdk
source ~/.bash_profile               # reload the environment (e.g. JAVA_HOME)

Download Spark 2.4.7 with Hadoop 2.6 (tar archive).
Extract the contents.
Move the contents of the folder to:

/usr/local/spark
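The download/extract/move steps above can be rehearsed as a shell sketch. A tiny dummy archive and a throwaway `/tmp` destination stand in for the real tarball and `/usr/local/spark` (both assumptions, so the mechanics can be tried without `sudo` or a 200 MB download); in practice you would first fetch the archive, e.g. from the Apache archive mirror:

```shell
cd /tmp

# In practice, download the real archive first, e.g.:
#   wget https://archive.apache.org/dist/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.6.tgz
# Here we build a tiny dummy archive so the steps can be tried offline:
mkdir -p spark-2.4.7-bin-hadoop2.6/bin
echo 'echo "version 2.4.7"' > spark-2.4.7-bin-hadoop2.6/bin/spark-shell
chmod +x spark-2.4.7-bin-hadoop2.6/bin/spark-shell
tar -czf spark-2.4.7-bin-hadoop2.6.tgz spark-2.4.7-bin-hadoop2.6
rm -rf spark-2.4.7-bin-hadoop2.6

# The actual steps from the answer: extract, then move into place.
tar -xzf spark-2.4.7-bin-hadoop2.6.tgz
rm -rf /tmp/spark-demo
mv spark-2.4.7-bin-hadoop2.6 /tmp/spark-demo   # real target: sudo mv ... /usr/local/spark

# Sanity check that the binaries landed where expected:
/tmp/spark-demo/bin/spark-shell
```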

Now open each of:

/usr/bin/pyspark
/usr/bin/spark-shell
/usr/bin/spark-submit


and change the `exec` line in each file to the corresponding:

'exec /usr/local/spark/bin/pyspark "$@"'
'exec /usr/local/spark/bin/spark-shell "$@"'
'exec /usr/local/spark/bin/spark-submit "$@"'
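A hedged sketch of that edit, done with `sed` on a scratch copy of one wrapper; the original `exec` target shown is illustrative, and you should back up the real files under `/usr/bin` before applying the same substitution to them:

```shell
# Make a scratch copy of a wrapper script to practice on.
mkdir -p /tmp/wrapper-demo
cat > /tmp/wrapper-demo/pyspark <<'EOF'
#!/bin/bash
exec /usr/lib/spark/bin/pyspark "$@"
EOF

# Repoint the exec line at the new install (repeat the same idea for
# spark-shell and spark-submit, then apply to /usr/bin/* once verified):
sed -i 's|^exec .*/bin/pyspark|exec /usr/local/spark/bin/pyspark|' /tmp/wrapper-demo/pyspark

cat /tmp/wrapper-demo/pyspark
```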

Finally, run `spark-shell --version` (or `pyspark`) to confirm the new version is picked up.
Abhishek Sengupta