
I have been trying to install Spark using the tutorial, and every time I run the command sbt/sbt assembly I get the error "Error: Invalid or corrupt jarfile sbt/sbt-launch-0.13.5.jar".

I have tried everything: separately adding the sbt file to the sbt folder in the Spark folder, installing sbt individually, and checking the download and reinstalling it all over again, but in vain. Any advice about what I am doing wrong? Thanks.

user2330778
  • I am also running into this issue. I will note that I am using Ubuntu 15.04 and have java version "1.7.0_80" and Scala code runner version 2.10.4 – Frozenfire Jul 23 '15 at 19:18

5 Answers


OK, after playing around for a while I finally got it, and hopefully this will work for you as well. That tutorial builds Spark from source, but prebuilt binaries are provided. I'm using Spark 1.2.0, just as a note (1.4.1 wouldn't work for me).

This is on Ubuntu 15.04, but it should work the same on 14.04.

1) Remove the following lines from your ~/.bashrc

export SCALA_HOME=/usr/local/src/scala/scala-2.10.4
export PATH=$SCALA_HOME/bin:$PATH

2) Remove and reinstall Scala

sudo rm -rf /usr/local/src/scala
# The next line is only needed if you installed Scala another way; if so, remove the leading #
# sudo apt-get remove scala-library scala
wget http://www.scala-lang.org/files/archive/scala-2.11.7.deb
sudo dpkg -i scala-2.11.7.deb
sudo apt-get update
sudo apt-get install scala
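
To confirm the reinstall took effect, a quick version check (the exact wording of the output varies by release):

scala -version
# should report something like: Scala code runner version 2.11.7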

3) Download prebuilt Spark and extract it

wget http://d3kbcqa49mib13.cloudfront.net/spark-1.2.0-bin-hadoop2.4.tgz
tar -xzvf spark-1.2.0-bin-hadoop2.4.tgz 
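
Since a failed download from this CDN can also turn out to be an HTML error page (see the other answers here), it may be worth a quick sanity check that the tarball is real gzip data (a sketch, assuming the file utility is installed):

file spark-1.2.0-bin-hadoop2.4.tgz
# expected output mentions "gzip compressed data"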

4) Run spark-shell

cd spark-1.2.0-bin-hadoop2.4/
./bin/spark-shell
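
If the shell comes up, a quick smoke test is to run one of the bundled examples (run-example ships with the prebuilt package; SparkPi just estimates pi):

./bin/run-example SparkPi 10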

Sources (basically where I read up on this; the solution itself was trial and error):

https://chongyaorobin.wordpress.com/2015/07/01/step-by-step-of-installing-apache-spark-on-apache-hadoop/
https://gist.github.com/visenger/5496675

Frozenfire

If you have downloaded the Spark package from http://d3kbcqa49mib13.cloudfront.net/spark-1.1.0.tgz, then cross-check the file sbt/sbt-launch-0.13.5.jar. If it contains only a small amount of HTML (5-6 lines), you need to download the jar file manually; that HTML just indicates that the required jar file was not found (a quick way to verify this is sketched after the steps below). You can use the following steps on CentOS:

  1. Download the jar manually:
    wget http://dl.bintray.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.1/sbt-launch.jar -O ./sbt/sbt-launch-0.13.5.jar
  2. Prevent automatic downloading of the jar file (this comments out lines 47-68 of the script):
    sed -i '47,68s/^/#/' sbt/sbt-launch-lib.bash
  3. Build Spark again:
    sbt/sbt assembly
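
As a quick cross-check (a sketch, assuming the standard head and unzip tools are installed): a valid jar is a zip archive and will list its contents, while a failed download will show HTML tags instead.

head -n 5 sbt/sbt-launch-0.13.5.jar
unzip -l sbt/sbt-launch-0.13.5.jar | head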

It worked for me without altering the Scala installation. Hope it helps.


The sbt script does not download sbt-launch-0.13.5.jar properly, because there must be something wrong with the URLs it uses. As a result, the file it downloads contains just an HTML error page (with either a 400 or a 302 code). Until a better solution becomes available, as a workaround I would download sbt-launch-0.13.5.jar manually beforehand.
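
For example (a sketch: the 0.13.5 URL below is my assumption, patterned on the 0.13.1 Bintray link in another answer here, so adjust it if it 404s):

wget http://dl.bintray.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.5/sbt-launch.jar -O sbt/sbt-launch-0.13.5.jar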


In the SPARK_HOME/sbt/sbt-launch-lib.bash script, replace lines 53 through 57 with the following:

# Try curl first: --fail returns an error on HTTP 4xx/5xx instead of saving the
# error page, and --location follows redirects; if URL1 fails, the partial file
# is removed and URL2 is tried instead.
if hash curl 2>/dev/null; then
  (curl --fail --location --silent ${URL1} > ${JAR_DL} || \
   (rm -f "${JAR_DL}" && curl --fail --location --silent ${URL2} > ${JAR_DL})) && \
   mv "${JAR_DL}" "${JAR}"
# Fall back to wget with the same retry logic.
elif hash wget 2>/dev/null; then
  (wget --quiet ${URL1} -O ${JAR_DL} || \
   (rm -f "${JAR_DL}" && wget --quiet ${URL2} -O ${JAR_DL})) && \
   mv "${JAR_DL}" "${JAR}"
else  # the script's existing else branch (and its fi) continues from here

Then run the sbt assembly command again:

sbt/sbt assembly

Alternatively, the simplest method is to install sbt manually, as follows.

Download the sbt deb file:

wget http://dl.bintray.com/sbt/debian/sbt-0.13.5.deb

Then run

sudo dpkg -i sbt-0.13.5.deb
sudo apt-get update
sudo apt-get install sbt

Then build using sbt assembly instead of sbt/sbt assembly from the Spark home folder.
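
To confirm the manual install took effect before building, a quick check (sbtVersion is a standard sbt 0.13 task, though the first run may take a while as sbt boots):

which sbt
sbt sbtVersion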

prabeesh

@Frozenfire, I'm not sure if it's possible, but the Spark documentation Overview says:

For the Scala API, Spark 1.4.1 uses Scala 2.10. You will need to use a compatible Scala version (2.10.x).

And I wonder if that could be the reason you have this problem:

I'm using Spark 1.2.0 just as a note (1.4.1 wouldn't work for me)

Because you do:

sudo dpkg -i scala-2.11.7.deb

which downloads and installs scala-2.11.7.

I don't know, but this might be a clue!

PS1: this is more a comment on Frozenfire's answer, but I can't comment because of a lack of reputation, and I wanted to share this.

PS2: Building for Scala 2.11
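
For reference, the "Building for Scala 2.11" section of the Spark 1.4 docs describes roughly the following (quoted from memory, so treat the exact script name and Maven flags as assumptions and check the docs):

dev/change-version-to-2.11.sh
mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package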

Pierre Cordier