
In SBT, is using aggregate after dependsOn redundant if both list the same sub-modules? According to the documentation it seems so, but I have seen this pattern used before and I don't understand what the benefit is. If a project is defined with dependencies, doesn't that already imply what aggregate does for those same dependencies? I notice that my project build is much slower with this seemingly redundant aggregate than without it, and I'd like to know if I can safely remove it.

lazy val module = sbt.Project(...) dependsOn (foo, bar) aggregate (foo, bar)

OR just...

lazy val module = sbt.Project(...) dependsOn (foo, bar)

I am using SBT 0.13.6

Nathaniel Ford
Coder Guy

1 Answer


tl;dr aggregate causes a task to be executed in the aggregating module and in all the aggregated ones, while dependsOn sets a CLASSPATH dependency so that the classes of the depended-on module are visible to the depending module (in a given configuration, which is compile, the default, in the example).

A sample to demonstrate the differences.

I'm using the following build.sbt (nothing really interesting):

lazy val a = project

lazy val b = project

lazy val c = project dependsOn b aggregate (a,b)

The build defines three modules, a, b, and c, with the last one, c, aggregating a and b. There is also a fourth, implicit module (the root project) that aggregates all of a, b, and c.

> projects
[info] In file:/Users/jacek/sandbox/aggregate-dependsOn/
[info]     a
[info]   * aggregate-dependson
[info]     b
[info]     c

When I execute a task in an aggregating module, the task is also executed in the aggregated modules.

> compile
[info] Updating {file:/Users/jacek/sandbox/aggregate-dependsOn/}b...
[info] Updating {file:/Users/jacek/sandbox/aggregate-dependsOn/}a...
[info] Updating {file:/Users/jacek/sandbox/aggregate-dependsOn/}aggregate-dependson...
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Updating {file:/Users/jacek/sandbox/aggregate-dependsOn/}c...
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[success] Total time: 0 s, completed Oct 22, 2014 9:33:20 AM

The same happens when I execute a task in c: it is in turn executed in a and b, but not in the top-level project.

> show c/clean
[info] a/*:clean
[info]  ()
[info] b/*:clean
[info]  ()
[info] c/*:clean
[info]  ()
[success] Total time: 0 s, completed Oct 22, 2014 9:34:26 AM

When a task is executed in a or b, it runs only within that project.

> show a/clean
[info] ()
[success] Total time: 0 s, completed Oct 22, 2014 9:34:43 AM

Whether or not a task is also executed in the aggregated projects is controlled by the aggregate key, which can be scoped to a project and/or a task.

> show aggregate
[info] a/*:aggregate
[info]  true
[info] b/*:aggregate
[info]  true
[info] c/*:aggregate
[info]  true
[info] aggregate-dependson/*:aggregate
[info]  true

Change it as described in Aggregation:

In the project doing the aggregating, the root project in this case, you can control aggregation per-task. (...) aggregate in update is the aggregate key scoped to the update task.

Below I'm changing the key for the c module and the clean task so that clean is no longer executed in the aggregated modules a and b:

> set aggregate in (c, clean) := false
[info] Defining c/*:clean::aggregate
[info] The new value will be used by no settings or tasks.
[info] Reapplying settings...
[info] Set current project to aggregate-dependson (in build file:/Users/jacek/sandbox/aggregate-dependsOn/)
> show c/clean
[info] ()
[success] Total time: 0 s, completed Oct 22, 2014 9:39:13 AM
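A setting changed with set in the console only lasts for the current session. To keep it, the same setting could be written into build.sbt. The following is a minimal sketch, assuming the three modules from the build above and sbt 0.13 syntax; it is one possible way to express it, not the only one:

```scala
lazy val a = project

lazy val b = project

// c still depends on b and aggregates a and b,
// but clean run in c no longer fans out to a and b.
lazy val c = project
  .dependsOn(b)
  .aggregate(a, b)
  .settings(
    // the aggregate key scoped to the clean task of this project
    aggregate in clean := false
  )
```

With this in place, c/clean should behave as in the console session above, while other tasks run in c are still aggregated.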

The other tasks and settings of c are unaffected: executing one of them in c still runs it in the aggregated modules:

> show c/libraryDependencies
[info] a/*:libraryDependencies
[info]  List(org.scala-lang:scala-library:2.10.4)
[info] b/*:libraryDependencies
[info]  List(org.scala-lang:scala-library:2.10.4)
[info] c/*:libraryDependencies
[info]  List(org.scala-lang:scala-library:2.10.4)

While aggregate sets a dependency between sbt tasks so they get executed in the aggregated modules too, dependsOn sets a CLASSPATH dependency, i.e. the code in the dependsOned module is visible in the dependsOning one (sorry for the "new" words).
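The configuration of that CLASSPATH dependency can be spelled out explicitly. The sketch below is a hypothetical variant of the build above, not what the example actually uses: a plain dependsOn(b) is equivalent to "compile->compile", and adding "test->test" would also make b's test classes visible to c's tests.

```scala
lazy val a = project

lazy val b = project

// Hypothetical variant of c with explicit configuration mappings:
//   compile->compile : c's main code sees b's main classes (the default)
//   test->test       : c's test code also sees b's test classes
lazy val c = project
  .dependsOn(b % "compile->compile;test->test")
  .aggregate(a, b)
```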

Let's assume b has a main object as follows:

object Hello extends App {
  println("Hello from B")
}

Save the Hello object to b/hello.scala, i.e. under the b module.

Since c was defined to dependsOn b (see build.sbt above), the Hello object is visible in b (because it belongs to the module), but also in c.

> b/run
[info] Running Hello
Hello from B
[success] Total time: 0 s, completed Oct 22, 2014 9:46:44 AM
> c/runMain Hello
[info] Running Hello
Hello from B
[success] Total time: 0 s, completed Oct 22, 2014 9:46:58 AM

(I had to use runMain in c because run alone couldn't find the class; I can't explain why.)

Trying to run the task in a ends with java.lang.ClassNotFoundException: Hello, since the class is not visible in that module.

> a/runMain Hello
[info] Updating {file:/Users/jacek/sandbox/aggregate-dependsOn/}a...
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Running Hello
[error] (run-main-6) java.lang.ClassNotFoundException: Hello
java.lang.ClassNotFoundException: Hello
    at java.lang.ClassLoader.findClass(ClassLoader.java:530)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[trace] Stack trace suppressed: run last a/compile:runMain for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
    at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last a/compile:runMain for the full output.
[error] (a/compile:runMain) Nonzero exit code: 1
[error] Total time: 0 s, completed Oct 22, 2014 9:48:15 AM

Redefine a to dependsOn b in build.sbt and the exception vanishes.
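A sketch of that change, assuming the rest of build.sbt stays as shown earlier:

```scala
// a now has b on its CLASSPATH, so a/runMain Hello can find the class
lazy val a = project dependsOn b

lazy val b = project

lazy val c = project dependsOn b aggregate (a, b)
```

Because the definitions are lazy vals, a can refer to b even though b is declared after it.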

You should read Multi-project builds in the official documentation.

Jacek Laskowski
  • 4
    According to the doc for 'dependsOn': "This also creates an ordering between the projects when compiling them; util must be updated and compiled before core can be compiled." Does that mean dependsOn also implies an aggregate for the compile and update tasks of the modules in question, or must this be expressed separately using 'aggregate'? Or is the difference between aggregate and dependsOn in the ordering of task execution rather than in the selection of tasks to execute? – Coder Guy Oct 22 '14 at 20:37
  • 5
    A very interesting observation - `dependsOn` does set a kind of `aggregate` on the projects depended upon, but only for the tasks that make their jars available on the CLASSPATH. That's the case for `compile`, but not for `test`, which would only execute `compile` if it had not been done before. – Jacek Laskowski Oct 22 '14 at 21:25