
I'm experimenting with using Shake to build Java code, and am a bit stuck because of the unusual nature of the javac compiler. In general for each module of a large project, the compiler is invoked with all of the source files for that module as input, and produces all of the output files in one pass. Subsequently we typically take the .class files produced by the compiler and assemble them into a JAR (basically just a ZIP).

For example, a typical Java module project is arranged as follows:

  • a src directory that contains multiple .java files, some of them nested many levels deep in a tree.
  • a bin directory that contains the output from the compiler. Typically this output follows the same directory structure and filenames, with .class substituted for each .java file, but the mapping is not necessarily one-to-one: a single .java file can produce zero to many .class files!

The rules I would like to define in Shake are therefore as follows:

1) If any file under src is newer than any file under bin then erase all contents of bin and recreate with:

javac -d bin <recursive list of .java files under src>

I know this rule seems excessive, but without invoking the compiler we cannot know the extent of changes in output resulting from even a small change in a single input file.

2) If any file under bin is newer than module.jar then recreate module.jar with:

jar cf module.jar -C bin .

Many thanks!

PS Responses in the vein of "just use Ant/Maven/Gradle" will not be appreciated! I know those tools offer Java compilation out-of-the-box, but they are much harder to compose and aggregate. This is why I want to experiment with a Haskell/Shake-based tool.

Neil Bartlett

1 Answer


Writing rules which produce multiple outputs whose names cannot be statically determined can be a bit tricky. The usual approach is to find an output whose name is statically known and always need that, or if none exists, create a fake file to use as the static output (as per ghc-make, the .result file). In your case you have module.jar as the ultimate output, so I would write:

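-- Rule fragment; assumes Development.Shake and System.Directory (createDirectoryIfMissing) are imported.
"module.jar" %> \out -> do                         -- (%>) was spelled (*>) in older Shake releases
    javas <- getDirectoryFiles "" ["src//*.java"]
    need javas                                     -- depend on every source file
    liftIO $ removeFiles "" ["bin//*"]             -- wipe stale compiler output
    liftIO $ createDirectoryIfMissing True "bin"   -- Shake does not know about bin, so create it for javac
    () <- cmd "javac -d bin" javas
    classes <- getDirectoryFiles "" ["bin//*.class"]
    need classes
    cmd "jar cf" [out] "-C bin ."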

There is no advantage to splitting it up into two rules, since you never depend on the .class files (and can't really, since they are unpredictable in name), and if any source file changes then you will always rebuild module.jar anyway. This rule has all the dependencies you mention, plus if you add/rename/delete any .java or .class file then it will automatically recompile, as the getDirectoryFiles call is tracked.
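For the case where no statically named output exists, or where another rule (such as a test run) wants to depend on compilation without building the JAR, the fake-file approach mentioned above might look roughly like the sketch below. The stamp name `classes.stamp` and the helper names `stampFile` and `javaPatterns` are made up for illustration; the sketch also assumes a recent Shake (which provides `%>` and `writeFile'`) and that `javac` and `jar` are on the PATH.

```haskell
import Development.Shake
import System.Directory (createDirectoryIfMissing)

-- Hypothetical names chosen for this sketch; neither appears in the answer above.
stampFile :: FilePath
stampFile = "classes.stamp"

javaPatterns :: [FilePattern]
javaPatterns = ["src//*.java"]

main :: IO ()
main = shakeArgs shakeOptions $ do
    want ["module.jar"]

    -- The stamp stands in for the unpredictable set of .class files.
    stampFile %> \stamp -> do
        javas <- getDirectoryFiles "" javaPatterns
        need javas
        liftIO $ removeFiles "" ["bin//*"]
        liftIO $ createDirectoryIfMissing True "bin"
        () <- cmd "javac -d bin" javas
        writeFile' stamp ""  -- written only after javac succeeds

    -- Packaging depends on the stamp rather than on individual .class files.
    "module.jar" %> \out -> do
        need [stampFile]
        cmd "jar cf" [out] "-C bin ."
```

Any other rule that reads from bin can `need` the stamp in the same way. The stamp is kept outside bin so that it does not end up inside the JAR.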

Neil Mitchell
  • Great, thanks Neil! This basically works but a couple of small changes are required. For example we need to `need` the full path of the .java and .class files including the src/bin prefix. Also we need to create the `bin` dir explicitly before calling javac. Should I edit your answer? – Neil Bartlett Jun 25 '13 at 07:36
  • Yes please! I suspect if you change getDirectoryFiles to be "" ["src//*.java"] that would be easier than prepending src afterwards, although both should behave identically (and just as efficiently). For creating the directory you will need liftIO $ createDirectory - usually it isn't necessary as shake creates all directories for outputs it knows about, but here it doesn't know about bin. – Neil Mitchell Jun 25 '13 at 09:22
  • I am doing something similar, but I seem to get into "thread blocked indefinitely in an MVar operation" when I 'need' dynamically generated FilePaths. – user239558 Jul 04 '13 at 16:47
  • @user239558: Perhaps you could email a small example to the mailing list? https://groups.google.com/forum/?fromgroups#!forum/shake-build-system, or raise a stack-overflow ticket if that suits more. – Neil Mitchell Jul 04 '13 at 17:03
  • Sorry, I found the issue, and it's me :-) – user239558 Jul 04 '13 at 17:32
  • @user239558 I'd still be interested in the example, in case I can give a better error message - unless they are your own MVars that you are using directly, in which case I'd be curious whether you can avoid them by using something built in and safely wrapped. – Neil Mitchell Jul 05 '13 at 14:24
  • Hi @NeilMitchell - in https://github.com/ndmitchell/shake/issues/631#issuecomment-443520912 you mention that getDirectoryFiles shouldn't be called on generated files like you do above. I know it's been a few years - is there a new preferred way of solving this problem? – mavnn Apr 26 '19 at 15:57
  • @mavnn The above answer remains the best one I have. – Neil Mitchell Apr 28 '19 at 20:35
  • @NeilMitchell, I'm afraid you are not quite right with "never depend on the .class files". 1) The files are useful for running unit tests, which can be run without building a JAR. 2) The javac compiler is incremental, so it can reuse the previously generated class files. Removing them forces excessive recompilation. – Vladimir Sitnikov Jun 13 '20 at 06:56
  • @VladimirSitnikov thanks for the info. Good to know. I imagine with a bit more complexity you could capture that too, although it wouldn't be as simple. – Neil Mitchell Aug 30 '21 at 13:58