Java developer needs help understanding the .NET build process in F#

Question

Most of my development has been in Java, where I am used to having a runtime, a compiler, and a build tool. So now I'm trying to come into the .NET world, specifically using VSCode, Ionide plugin, and F#, to build an F# program. I'm having a hard time understanding the direct comparisons with the Java build process. This is my rough understanding so far:

JRE -> .NET Runtime?
JDK tools -> Microsoft Build Tools 2015 which includes the F# compiler and other tools?
Paket -> maven?
FAKE -> maven's <build> section in pom.xml?
paket.dependencies and paket.references -> maven's <dependencies> section in pom.xml?
*.*proj file -> ???

I'm really confused about the *proj file. I thought it was related to MSBuild, and I thought FAKE was a replacement for MSBuild, but some FAKE examples I've seen reference this file and pass it to an MSBuildRelease task.

Also, why does paket need a dependencies and references file?

I was hoping someone could confirm, clarify, add, or any of the above, to my level of understanding so far. Much appreciated!

Edit:

I know this question is convoluted and not very specific. Thank you all for taking the time to weed through it and answer what you're able. I appreciate it.

score 6 · Answer 1 · answered Mar 26 '17 at 01:14

Since nobody has yet answered the Paket section of your question, I'll tackle that one.

In one Git repository, you might have multiple projects. For example, it's a very common pattern (so common that the ProjectScaffold repo is set up that way because it's what most people want) to have a separate project for your main code and for your tests. In your MyApp.fsproj file, you might not want to have a reference to NUnit or XUnit, but you do want those references in your MyApp.Tests.fsproj file.

The paket.dependencies file is one-per-repository, and it lists all the packages that any project in the repository wants to use. The paket.references files, plural, are one-per-project, and go in the same directory as the .fsproj files they correspond to. They list the packages that that project wants to reference. So in the paket.references file in the same directory as MyApp.Tests.fsproj, you would list NUnit or XUnit — but you would not list the unit-testing libraries in the paket.references file that's in the same directory as MyApp.fsproj.

If you only had one project in your Git repo, then there wouldn't be a need for separate paket.dependencies and paket.references files, as they could be combined into a single file that served both purposes. But as soon as you have multiple .fsproj files in a single repository, then the separation between dependencies and references becomes useful, as you can have all your dependencies listed in a single paket.dependencies file in your repo root, but give each project its own paket.references file so that it can reference only the subset of dependencies that it specifically needs.
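
To make the split concrete, here is a rough sketch of the three files for the MyApp / MyApp.Tests layout described above. The package list is only an illustration (Newtonsoft.Json in particular is an assumed example, not something from the question). The paket.dependencies file in the repo root lists everything any project uses:

```text
source https://api.nuget.org/v3/index.json

nuget FSharp.Core
nuget Newtonsoft.Json
nuget NUnit
```

The paket.references file next to MyApp.fsproj picks only what the main project needs:

```text
FSharp.Core
Newtonsoft.Json
```

And the paket.references file next to MyApp.Tests.fsproj adds the test library:

```text
FSharp.Core
NUnit
```

Versions are resolved once for the whole repository, so both projects end up on the same FSharp.Core.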

Your explanation of separating out unit tests into their own project helps me make sense of this. In Java, we usually keep our unit tests in a different folder from our code, and tell the build tool where the tests are located. Further, instead of different .references files to list dependencies for different compilation units, Maven and Gradle have 'scoped' dependencies, where you can tag your dependencies as being part of 'tests', 'compilation', 'runtime-only', etc. Just a different way of organizing things, I guess. Thanks for your answer! - jmrah, Mar 26 '17 at 18:36
We also put tests in a different folder. And we also have "scoped" dependencies. But scope is not enough. We want to specify dependencies for each compilation unit individually. - Fyodor Soikin, Mar 26 '17 at 19:56

score 5 · Answer 2 · edited May 23 '17 at 10:30

Let me try to tackle these one by one...

Q: What are the *proj files for?

The *proj files are the native language of MSBuild (or xbuild if you're on Mono). In the simplest case, they just list all the files to be compiled and all references to other projects that they use, plus a few other properties, like target platform, CPU architecture, etc. The original idea was that one *proj file produces one compilation unit, called an assembly (aka "DLL", which roughly corresponds to a JAR). This is still mostly true, but not always. For example, TypeScript projects can produce multiple JS files.
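
As a rough illustration, a minimal project file might look like the sketch below. It assumes the newer SDK-style format (older project files carry more boilerplate but contain the same pieces), and the file names, the target framework, and the MyLib project are made up for the example:

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <!-- Properties: what kind of output to produce and for which platform -->
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
  </PropertyGroup>

  <!-- Files to compile; in F# the order of these entries matters -->
  <ItemGroup>
    <Compile Include="Library.fs" />
    <Compile Include="Program.fs" />
  </ItemGroup>

  <!-- References to other projects (hypothetical sibling project) -->
  <ItemGroup>
    <ProjectReference Include="..\MyLib\MyLib.fsproj" />
  </ItemGroup>

</Project>
```

In Maven terms, this covers roughly what the packaging, source directory, and module dependency settings in pom.xml would tell the compiler plugin.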

But the core architecture of MSBuild makes these files very flexible. They are basically built on a system of plugins (called "Tasks"), where each Task does some specific thing. A number of tasks come in the box, such as "compile C# files" or "output to log", but you can also add your own, or install some in the form of packages, or globally. Plus, the core of MSBuild allows for some tricks, like variables (kind of), loops (sort of), and branching (to an approximation), to the point where you can actually write kind of real programs entirely in a *proj file. The original idea behind all this was that MSBuild would become the primary vehicle for the whole build process, no other tools required. And for simple toy projects it kind of is: when all you need is just compile a bunch of files to a DLL, MSBuild does the job marvelously. And even with not-so-simple projects, people have tried to do the whole build entirely in *proj files. There are some projects that do this even now.

However, over time it became clear that writing high-level build logic in XML is so tricky and backwards that it becomes unmaintainable very fast. And so a bunch of tools appeared specifically for implementing the high-level build logic. Which brings me to...

Q: Isn't FAKE a replacement for MSBuild?

Yes, it is. Well, no, actually it isn't. Well, maybe. Kind of.

As I said above, it is possible to write the whole build logic in MSBuild: compile, copy, compress (where needed), bundle (if required), package (where appropriate), and publish (where legal :-). But it is extremely awkward to write and quickly becomes unmaintainable.

Enter FAKE: in FAKE, you write the whole logic of the build process entirely in F#. This means you can use libraries, abstractions, higher-order functions, special types - all the goodness of a real high-level language. You can specify the list of source files with glob patterns, call the F# compiler with FscHelper, run tests with XUnitHelper, package and publish with Paket helper, deploy to Azure, and even notify your teammates on Slack - all without leaving the comfort of the functional goodness.

There is one problem though: the build script is not analyzable by tools. Since it's a real program, and not a data format, an IDE wouldn't know how to tease the list of source files out of it, or how to add new ones; a package manager wouldn't know where to insert references to packages, and so on. The MSBuild files, on the other hand, are the definition of analyzability: they're XML, and there are strong conventions in place about what goes where.

And so it became: we use MSBuild to specify some strict, yet simple properties, and then use a FAKE script to "drive" the higher-level build logic - such as running tests, generating docs, publishing, deploying, etc. And instead of specifying a list of files and calling F# compiler directly, our build script just calls MSBuild.
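
A minimal sketch of such a script, assuming the FAKE 4 API that was current when this was written (FAKE 5 and later reorganized these helpers into separate modules) and assuming FAKE was installed through Paket into the packages folder. The directory layout and target names are made up for the example:

```fsharp
// build.fsx - illustrative only; paths and target names are assumptions
#r "packages/FAKE/tools/FakeLib.dll"
open Fake

// Hypothetical output directory for the compiled assemblies
let buildDir = "./build/"

// Wipe the output directory before building
Target "Clean" (fun _ ->
    CleanDir buildDir
)

// Find every project file with a glob pattern and hand it to MSBuild,
// which reads the file list and references from the *.fsproj itself
Target "Build" (fun _ ->
    !! "src/**/*.fsproj"
    |> MSBuildRelease buildDir "Build"
    |> Log "Build-Output: "
)

// Declare the order: Clean runs before Build
"Clean"
  ==> "Build"

RunTargetOrDefault "Build"
```

Note that the script never lists source files or compiler flags. That information stays in the project files, where the IDE and Paket can read and edit it; the script only decides what happens and in what order.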

Q: Why does Paket need both references and dependencies?

In short: because we usually deal with more than one compilation unit (aka "Project") at once (we call such a system of projects a "Solution"), and we don't like it when different projects under one solution use different versions of the same packages: conflicts abound, and our heads explode. So we specify one list of packages for the whole solution (in "dependencies"), and then every project can pick the ones that it actually uses (in "references").

In long: @rmunn's excellent answer discusses this in a bit more detail.

Also of interest: the "native" .NET package manager NuGet doesn't actually have this property. Every project under NuGet has its own, completely independent list of packages. And these head-exploding version conflicts do actually happen. On large solutions, depressingly often. This is one of the many reasons why Paket is superior.

So, you're saying that the reason that there is a .dependencies file *and* a .references file is so each project is guaranteed to use the same version of a dependency as other projects in the solution? Why is this important in .NET land? Do different versions of dependencies overwrite each other when downloaded or something? It doesn't seem to be a problem with Java's build tools. I mean, I certainly *do not* want my head to explode :), but are you able to elaborate on this point at all? BTW, great answer. Thank you for taking the time. - jmrah, Mar 26 '17 at 18:59
My understanding is that the same issues do exist in Java, but just like in .NET, default conflict resolution may alleviate the majority of these issues until it doesn't, and then you have to deal with it. I've heard some discussion on how the module changes in Java 9+ aim to address this same problem. - jpierson, Jul 05 '18 at 02:47

score 0 · Answer 3 · answered Mar 26 '17 at 00:35

I am on a Windows platform. The F-Sharp compiler that came with my (old) version is the Microsoft one, and is called Fsc.exe. I use it like this: Fsc.exe HelloWorld.fs.

I recently started using Fake. In my Fake build script there is a command, MSBuild, that calls the MSBuild which in turn calls Fsc.exe. So in a sense, all that extra packaging is just there so that I need to type less (the argument string to Fsc easily ends up being a bit long, and it is nice to have tools to make sure the arguments are 'correct').

The .fsproj file holds the information that MSBuild uses to build the arguments it sends to Fsc. The file can also be read by e.g. Visual Studio, so that your files appear as one project in your (Visual Studio) editor.
