Unable to install spark 2.2 in Cloudera Quickstart VM (5.10)

Question

I followed the Cloudera documentation linked below, downloaded the parcel, and placed it as instructed. Has anyone installed it successfully, and if so, what steps did you take?

(https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html)

/opt/cloudera/csd/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658-el5.parcel

However, service cloudera-scm-server restart does not execute. To use Cloudera Express (free), run:

sudo /home/cloudera/cloudera-manager --express

This requires at least 8 GB of RAM and at least 2 virtual CPUs.

I had many issues when using Cloudera Express. I think you have to wait until they add 2.2 to the VM. I tried just now and could not get it to work. A clean machine of your own is best, but then there are many other issues to consider. – thebluephantom, Jun 16 '18 at 11:50
I added it and then got issues with the Hive metastore. I suspect it has to do with 1.6 already being there. Hopeless. – thebluephantom, Jul 29 '18 at 20:52
I have installed the MapR Sandbox, which has Spark 2.2 but not Impala. The interface is nicer and it seems to work with little effort. That's my advice. – thebluephantom, Aug 11 '18 at 09:11

score 2 · Answer 1 · answered Nov 02 '18 at 07:21

SPARK 2.2 Installation Setup on Cloudera VM

Step 1: Download a quickstart_vm from the link:
Prefer the VMware image, as it is easy to use, though all the options are viable.
The entire tar file is around 5.4 GB. You need to provide a business email address, as personal email addresses are not accepted.


Step 2: The virtual environment requires around 8 GB of RAM; allocate sufficient memory to avoid performance problems.


Step 3: Please open the terminal and switch to root user as:
         su root
         password: cloudera

Step 4: The Cloudera VM ships with Java 1.7.0_67, which is too old for Spark 2.2 (it needs Java 8 or later). To avoid Java-related exceptions, install Java 8 with the following commands:
(a). Downloading Java:
wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz

(b). Switch to /usr/java/ directory with “cd /usr/java/” command.

(c). Copy the downloaded Java tar file to the /usr/java/ directory.

(d). Untar the archive with “tar -zxvf jdk-8u131-linux-x64.tar.gz”

(e). Open the profile file with the command “vi ~/.bash_profile” 

(f). Set JAVA_HOME to the new Java directory by adding this line:
       export JAVA_HOME=/usr/java/jdk1.8.0_131

       Save and Exit.


(g). In order to reflect the above change, following command needs to be executed on the shell:
       source ~/.bash_profile

Step 5: The Cloudera VM provides Spark 1.6 by default. However, the 1.6 APIs are old and do not match what production environments typically run, so we need to download and manually install Spark 2.2.

(a). Switch to /opt/  directory with the command:
“cd /opt/”

(b). Download spark with the command:
wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz

(c). Untar the spark tar with the following command:
tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz

(d). We need to define some environment variables as default settings:
Please open a file with the following command:
vi /opt/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh
Paste the following configurations in the file:
SPARK_MASTER_IP=192.168.50.1
SPARK_EXECUTOR_MEMORY=512m
SPARK_DRIVER_MEMORY=512m
SPARK_WORKER_MEMORY=512m
SPARK_DAEMON_MEMORY=512m
Save and exit

(e). Start Spark with the following command:
/opt/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
Then export SPARK_HOME:
export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.7/

(f). Change the permissions of the directory:
chmod 777 -R /tmp/hive

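(g). Run “$SPARK_HOME/bin/spark-shell”; it should start Spark 2.2. A bare “spark-shell” may still pick up the VM's default Spark 1.6 if that one comes first on your PATH.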
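
The spark-env.sh settings from step (d) can also be written by a small script instead of pasting them into vi. The following is only a sketch under assumptions: write_spark_env is a hypothetical helper name (not part of Spark), and the master IP default is simply the value used in this answer, which may not match your VM.

```shell
#!/usr/bin/env bash
# Sketch: write the spark-env.sh settings from step (d) into a Spark conf directory.
# write_spark_env is a hypothetical helper, not something Spark ships.
write_spark_env() {
  local conf_dir="$1"                     # e.g. /opt/spark-2.2.0-bin-hadoop2.7/conf
  local master_ip="${2:-192.168.50.1}"    # default taken from the answer; adjust for your VM
  mkdir -p "$conf_dir" || return 1
  # Overwrites any existing spark-env.sh in that directory.
  cat > "$conf_dir/spark-env.sh" <<EOF
SPARK_MASTER_IP=$master_ip
SPARK_EXECUTOR_MEMORY=512m
SPARK_DRIVER_MEMORY=512m
SPARK_WORKER_MEMORY=512m
SPARK_DAEMON_MEMORY=512m
EOF
}
```

Usage, assuming the install path from this answer: write_spark_env /opt/spark-2.2.0-bin-hadoop2.7/conf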

This is a dupe of swapnil's answer here: https://stackoverflow.com/questions/53149077/cloudera-quick-start-vm-lacks-spark-2-0-or-greater. The other one is better formatted. – Concrete Gannet, Apr 13 '20 at 07:47

score 1 · Answer 2 · answered Mar 18 '19 at 06:46

Please follow the video below; it covers all the steps required to install Spark 2 in the Cloudera VM.

YouTube link - https://www.youtube.com/watch?v=lQxlO3coMxM

Also, to start Cloudera Express (free), your VM should have at least 8 GB of RAM allocated. If you only have the default 4 GB allocated, you can force it to start using the command below, and then follow the video above.

sudo /home/cloudera/cloudera-manager --force --express
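
The choice between the plain and the forced command can be sketched as a check on the VM's memory. This is an illustration only: express_flags and total_mem_kb are hypothetical helper names, and the 8 GB threshold is the figure quoted in this thread, not something read from the cloudera-manager script itself. Note that MemTotal usually reports slightly less than the RAM allocated to the VM, so a VM with exactly 8 GB may still fall under this threshold.

```shell
#!/usr/bin/env bash
# Sketch: pick the flags for /home/cloudera/cloudera-manager based on available RAM.
# Both function names are hypothetical; the 8 GB limit is the one quoted in the thread.
express_flags() {
  local mem_kb="$1"                       # total memory in kB
  local min_kb=$((8 * 1024 * 1024))       # 8 GB expressed in kB
  if [ "$mem_kb" -lt "$min_kb" ]; then
    echo "--force --express"              # below the stated minimum: force the start
  else
    echo "--express"
  fi
}

# Read MemTotal (in kB) from /proc/meminfo; Linux only.
total_mem_kb() {
  awk '/^MemTotal:/ {print $2}' /proc/meminfo
}
```

Usage: sudo /home/cloudera/cloudera-manager $(express_flags "$(total_mem_kb)")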

score 0 · Answer 3 · edited Oct 20 '18 at 15:16

Try this command

sudo /home/cloudera/cloudera-manager --express --force

answered Oct 20 '18 at 14:59 by Priyanka Gupta · edited Oct 20 '18 at 15:16 by Hossein Golshani

It would be nice to explain. – OznOg Oct 20 '18 at 16:34
It gives a warning to run Cloudera Express in a VM with at least 8 GB of RAM. By adding --force, we tell it to run Cloudera Express even with the current RAM. – Priyanka Gupta Oct 21 '18 at 08:16

thebluephantom · Answer 4 · 2018-11-02T15:03:45.427

I gave up on this; nothing works well with either the parcel or the non-parcel installation.

As soon as Cloudera Express is started, there are numerous errors, and it uses Java 7 instead of Java 8.

I installed a MapR VM with Spark 2.x instead. No issues. It worked the first time.

That works well. This is my advice #1.

If you want Kudu, then I would install CentOS and set things up yourself. This is advice #2. You may miss Impala, but for pure research and development that is not much of an issue.

score 0 · Answer 5 · answered Aug 31 '19 at 07:24

With the following two commands, my Spark 2.2 was automatically updated to Spark 2.4:

(i) sudo yum update

Your Java home path may be broken; in that case, export the Java home path in your bash profile.

(a) vi ~/.bash_profile (b) (c) source ~/.bash_profile
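
Editing the profile by hand can leave duplicate export lines after a few attempts. A minimal sketch of an idempotent alternative follows; add_export is a hypothetical helper name, and the JAVA_HOME path in the usage line is the one from the first answer in this thread, so treat it as an assumption about your install location.

```shell
#!/usr/bin/env bash
# Sketch: append "export NAME=VALUE" to a profile file only if that exact line is missing.
# add_export is a hypothetical helper name.
add_export() {
  local file="$1" name="$2" value="$3"
  local line="export $name=$value"
  touch "$file" || return 1
  # -F fixed string, -x whole-line match, -q quiet: true only if the line already exists
  if ! grep -Fxq -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

Usage: add_export ~/.bash_profile JAVA_HOME /usr/java/jdk1.8.0_131, followed by source ~/.bash_profile.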

pramod kumar · Answer 6 · 2019-12-31T05:02:38.143

Just download the version of Spark that you need, say 'spark-2.2.0-bin-hadoop2.6'.

Open ~/.bash_profile in the vi editor (vi ~/.bash_profile) and paste the two lines below:

export SPARK_HOME=/home/cloudera/Downloads/spark-2.2.0-bin-hadoop2.6
export PATH=$PATH:$HOME/bin:$SPARK_HOME/bin

Save it, then run the command: source ~/.bash_profile

Now start spark-shell. Note: make sure you have JDK 1.8 installed.
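
To check the JDK requirement before launching spark-shell, the output of java -version can be parsed for the major version. This is a sketch only: java_major is a hypothetical helper name, and it handles the common "1.8.0_131" and "11.0.2" style version strings, not every vendor format.

```shell
#!/usr/bin/env bash
# Sketch: extract the Java major version from the first line of `java -version` output.
# java_major is a hypothetical helper name.
java_major() {
  local ver
  # Pull the quoted version string out of the first line, e.g. 1.8.0_131
  ver=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    "")  return 1 ;;                            # no version string found
    1.*) ver=${ver#1.}; echo "${ver%%.*}" ;;    # legacy scheme: 1.8.0_131 -> 8
    *)   echo "${ver%%.*}" ;;                   # newer scheme: 11.0.2 -> 11
  esac
}
```

Usage: java_major "$(java -version 2>&1)" (java prints its version on stderr, hence the redirect). A result below 8 means the VM is still using the bundled Java 7.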

answered Dec 29 '19 at 12:36 by pramod kumar · edited Dec 31 '19 at 05:02

score -1 · Answer 7 · answered Aug 03 '19 at 12:52

  SPARK 2.2 Installation Setup on Cloudera VM

    Step 1: Download a quickstart_vm from the link:
    Prefer the VMware image, as it is easy to use, though all the options are viable.
    The entire tar file is around 5.4 GB. You need to provide a business email address, as personal email addresses are not accepted.


    Step 2: The virtual environment requires around 8 GB of RAM; allocate sufficient memory to avoid performance problems.


    Step 3: Please open the terminal and switch to root user as:
             su root
             password: cloudera

    Step 4: The Cloudera VM ships with Java 1.7.0_67, which is too old for Spark 2.2 (it needs Java 8 or later). To avoid Java-related exceptions, install Java 8 with the following commands:
    (a). Downloading Java:
    wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz

    (b). Switch to /usr/java/ directory with “cd /usr/java/” command.

    (c). Copy the downloaded Java tar file to the /usr/java/ directory.

    (d). Untar the archive with “tar -xvzf jdk-8u131-linux-x64.tar.gz”

    (e). Open the profile file with the command “vi ~/.bash_profile” 

    (f). Set JAVA_HOME to the new Java directory by adding this line:
           export JAVA_HOME=/usr/java/jdk1.8.0_131

           Save and Exit.


    (g). In order to reflect the above change, following command needs to be executed on the shell:
           source ~/.bash_profile

    Step 5: The Cloudera VM provides Spark 1.6 by default. However, the 1.6 APIs are old and do not match what production environments typically run, so we need to download and manually install Spark 2.2.

    (a). Switch to /opt/  directory with the command:
    “cd /opt/”

    (b). Download spark with the command:
    wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz

    (c). Untar the spark tar with the following command:
    tar -xvzf spark-2.2.0-bin-hadoop2.7.tgz

    (d). We need to define some environment variables as default settings:
    Please open a file with the following command:
    vi /opt/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh
    Paste the following configurations in the file:
    SPARK_MASTER_IP=192.168.50.1
    SPARK_EXECUTOR_MEMORY=512m
    SPARK_DRIVER_MEMORY=512m
    SPARK_WORKER_MEMORY=512m
    SPARK_DAEMON_MEMORY=512m
    SPARK_LOCAL_IP=127.0.0.1
    Save and exit

    (e). Start Spark with the following command:
    /opt/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
    Then export SPARK_HOME:
    export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.7/

    (f). Change the permissions of the directory:
    chmod 777 -R /tmp/hive

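    (g). Run “$SPARK_HOME/bin/spark-shell”; it should start Spark 2.2. A bare “spark-shell” may still pick up the VM's default Spark 1.6 if that one comes first on your PATH.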

Same answer as swapnil shashank's, with the small modifications below:

SPARK_LOCAL_IP=127.0.0.1
tar -xvzf spark-2.2.0-bin-hadoop2.7.tgz
