I am running a Spark application and want to pack the test classes into the fat jar. What is weird is that "sbt assembly" runs successfully, but "sbt test:assembly" fails.

I tried sbt-assembly : including test classes, but it didn't work in my case.

SBT version : 0.13.8

build.sbt:

import sbtassembly.AssemblyPlugin._

name := "assembly-test"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  ("org.apache.spark" % "spark-core_2.10" % "1.3.1" % Provided)
    .exclude("org.mortbay.jetty", "servlet-api")
    .exclude("commons-beanutils", "commons-beanutils-core")
    .exclude("commons-collections", "commons-collections")
    .exclude("commons-logging", "commons-logging")
    .exclude("com.esotericsoftware.minlog", "minlog")
    .exclude("com.codahale.metrics", "metrics-core"),
  "org.json4s" % "json4s-jackson_2.10" % "3.2.10" % Provided,
  "com.google.inject" % "guice" % "4.0"
)

// Apply the assembly settings in the Test configuration (intended to enable "sbt test:assembly")
Project.inConfig(Test)(assemblySettings)
Grant

2 Answers

As an addition to Wesley Miao's answer, the code needs to be adapted a bit for newer versions (0.13.0 and later) of the sbt-assembly plugin, in case anyone is wondering about deprecation warnings:

assemblyMergeStrategy in assembly := {
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last
    case PathList("com", "google", xs @ _*) => MergeStrategy.last
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
    case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
    case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
    case "about.html" => MergeStrategy.rename
    case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
    case "META-INF/mailcap" => MergeStrategy.last
    case "META-INF/mimetypes.default" => MergeStrategy.last
    case "plugin.properties" => MergeStrategy.last
    case "log4j.properties" => MergeStrategy.last
    case x =>
        val oldStrategy = (assemblyMergeStrategy in assembly).value
        oldStrategy(x)
}
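To tie this back to the original test:assembly question: with the newer plugin syntax, the same merge strategy can also be scoped to the Test configuration. A sketch, assuming sbt-assembly 0.14.x, where `baseAssemblySettings` is exposed for use with `inConfig`:

```scala
// build.sbt sketch -- assumption: sbt-assembly 0.14.x key names.
// Enable the assembly task in the Test configuration so "sbt test:assembly" works,
// and reuse the merge strategy defined for the main assembly.
Project.inConfig(Test)(baseAssemblySettings)

assemblyMergeStrategy in (Test, assembly) := (assemblyMergeStrategy in assembly).value
```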
PermaFrost
  • I've been using Scala for more than a year and I have no idea what this bit of code does, but the important thing is that it works. Thanks – Felipe Feb 01 '16 at 02:28
  • @FelipeAlmeida You seem to be experienced with Spark, so I was wondering if you could help me out a bit... I am trying to create a jar file from my SBT project to run it. Do you know how I can do that? –  Jul 13 '16 at 00:16
  • 1
    @1290 Sure. I've actually written a piece on this: http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin – Felipe Jul 13 '16 at 15:29
  • spark 2.x.x requires a slight variation of this solution: http://queirozf.com/entries/creating-scala-fat-jars-for-spark-on-sbt-with-sbt-assembly-plugin#spark2 – ecoe Jan 18 '17 at 00:26
You will have to define mergeStrategy in assembly, as I did for my Spark app below.

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last
    case PathList("com", "google", xs @ _*) => MergeStrategy.last
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
    case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
    case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
    case "about.html" => MergeStrategy.rename
    case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
    case "META-INF/mailcap" => MergeStrategy.last
    case "META-INF/mimetypes.default" => MergeStrategy.last
    case "plugin.properties" => MergeStrategy.last
    case "log4j.properties" => MergeStrategy.last
    case x => old(x)
  }
}
Wesley Miao
  • I put all this stuff in the sbt file and added more "exclude(...)" clauses; the jar can be generated and the test classes are in it. However, I found that "provided" doesn't work – Grant May 26 '15 at 03:14
  • "provided" is only needed if you submit your Spark app through spark-submit. If you run your Spark app directly, don't use "provided". – Wesley Miao May 26 '15 at 05:57
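Building on the comment above: a common middle ground (a sketch based on the pattern in the sbt-assembly README; the exact key syntax depends on your sbt version, sbt 0.13.x shown here) is to keep Spark as Provided for the fat jar while still making Provided dependencies visible to "sbt run", since `run` normally uses the Runtime classpath, which excludes them:

```scala
// build.sbt sketch -- assumption: sbt 0.13.x key syntax.
// Redefine `run` to use the Compile classpath (which includes Provided
// dependencies) instead of the default Runtime classpath (which does not),
// so the app can still be started locally with "sbt run".
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```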