distcc - are there cases it requires a synchronized network filesystem

Question

Two simplified makefiles

makefile1

a.txt:
    echo "123144234" > a.txt

t2: a.txt
    cat a.txt > b.txt

makefile2

t1:
    echo "123144234" > a.txt

t2: t1
    cat a.txt > b.txt

Both makefiles have the same functionality.

Both makefiles can be run in parallel, because the dependency of t2 on a.txt (makefile1) or on t1 (makefile2) still forces the two recipes to run in the right order.

However, there is a critical difference that might (or does?) matter when it comes to distributed builds.

In makefile1, t2 depends directly on the artifact a.txt, which is also the name of the target that produces it. In makefile2, however, while the recipe and artifact of t1 are the same as for a.txt, the name of the target is not a.txt.

This difference is key because GNU make (and I assume distcc) does NOT parse the recipe, nor analyze the filesystem at runtime, to determine all the artifacts for a given target. In makefile2, GNU make does NOT create ANY relationship between a.txt and t1.
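To make that concrete, here is a toy Python model of the two makefiles. It is only a sketch of the idea, not how GNU make is actually implemented: the dictionaries and the helper names `known_names` and `build_order` are made up for illustration. The point it shows is that the tool only ever sees target names and prerequisite names, while the recipe stays an opaque string.

```python
# Toy model (hypothetical, not GNU make's real data structures):
# a rule is just a target name, a list of prerequisite names, and an
# opaque recipe string that is never parsed.

makefile1 = {
    "a.txt": {"prereqs": [], "recipe": 'echo "123144234" > a.txt'},
    "t2": {"prereqs": ["a.txt"], "recipe": "cat a.txt > b.txt"},
}

makefile2 = {
    "t1": {"prereqs": [], "recipe": 'echo "123144234" > a.txt'},
    "t2": {"prereqs": ["t1"], "recipe": "cat a.txt > b.txt"},
}


def known_names(rules):
    """Every name the tool can reason about: targets plus prerequisites."""
    names = set(rules)
    for rule in rules.values():
        names.update(rule["prereqs"])
    return names


def build_order(rules, goal, seen=None):
    """Depth-first walk over prerequisites; recipes are never inspected."""
    seen = [] if seen is None else seen
    for prereq in rules.get(goal, {"prereqs": []})["prereqs"]:
        build_order(rules, prereq, seen)
    if goal not in seen:
        seen.append(goal)
    return seen


print(build_order(makefile1, "t2"))        # order for makefile1
print(build_order(makefile2, "t2"))        # order for makefile2
print("a.txt" in known_names(makefile2))   # is a.txt visible in makefile2?
```

In this model both makefiles give a correct ordering for t2, but a.txt only appears as a known name in makefile1. In makefile2 it exists solely inside recipe text.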

When the build is done as make -j i.e. parallel but not distributed, this difference is irrelevant because all make targets are run on the same machine i.e. all the make instances access the same filesystem.

But let's consider what could (or does?) happen during a distributed build if the two targets are built on two separate machines.

In both makefiles, the recipe for t2 would be run AFTER the recipe for a.txt/t1.

However, in makefile1 the dependency of t2 on a.txt is explicit i.e. distcc knows that to make t2 on a separate machine, it must send the file a.txt to that separate machine.

QUESTION

1. If makefile2 is run using distcc, without a synchronized distributed filesystem, and t2 is built on another machine, will there be a build error because a.txt is not present on the other machine?
2. What are the options for a distributed Linux filesystem?

Maxim Egorushkin · Accepted Answer · 2020-02-18T18:12:55.820

distcc is merely a replacement for gcc. It uses local gcc to preprocess the source file, then sends it for compilation to another machine, receives back the object file and saves it into the local filesystem. distcc doesn't require a shared network filesystem or clock synchronization between the participating hosts.
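The flow described above can be sketched as a small Python simulation. This is not distcc's code or protocol: there is no real compiler or network here, and the names `preprocess_locally`, `remote_compile` and `distcc_like_compile` are hypothetical. It only illustrates why the remote side never needs access to the client's files.

```python
# Toy simulation only: no real compiler, no network. All names are hypothetical.

def preprocess_locally(source_name, local_fs):
    """Stand-in for the local preprocessor: inline every #include "file"."""
    lines = []
    for line in local_fs[source_name].splitlines():
        if line.startswith('#include "'):
            header = line.split('"')[1]
            lines.append(local_fs[header])
        else:
            lines.append(line)
    return "\n".join(lines)


def remote_compile(preprocessed_text):
    """Stand-in for the server: its only input is the text it was sent."""
    return b"OBJ:" + preprocessed_text.encode()


def distcc_like_compile(source_name, local_fs):
    unit = preprocess_locally(source_name, local_fs)   # runs on the client
    obj = remote_compile(unit)                         # runs on the server
    obj_name = source_name.rsplit(".", 1)[0] + ".o"
    local_fs[obj_name] = obj                           # saved on the client
    return obj_name


# The client's filesystem, modelled as a dict of file name -> contents.
local_fs = {
    "util.h": "int answer(void);",
    "main.c": '#include "util.h"\nint main(void) { return answer(); }',
}
obj_name = distcc_like_compile("main.c", local_fs)
print(obj_name, sorted(local_fs))
```

Note that `remote_compile` takes a single string and has no handle on `local_fs`. Everything it needs travels with the request, and the result is written back on the client side.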

There is also the newer "pump" functionality that preprocesses on remote servers, but it doesn't require a shared network filesystem or clock synchronization either.

Your make always runs locally.

Answering your questions:

1. distcc doesn't run make. make runs distcc instead of gcc. make examines dependencies and their timestamps locally.
2. make runs locally and it doesn't care whether the filesystem it uses is local or networked.
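As a rough illustration of the local timestamp check, here is a simplified Python sketch. It assumes the usual rule that a target is out of date when its file is missing or older than a prerequisite. The function `needs_rebuild` is a made-up helper, and real make has many more cases (phony targets, implicit rules, and so on).

```python
import os
import tempfile


def needs_rebuild(target, prereqs):
    """Simplified rule: rebuild if the target file is missing, or if any
    prerequisite is missing or has a newer modification time."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    for prereq in prereqs:
        if not os.path.exists(prereq) or os.path.getmtime(prereq) > target_mtime:
            return True
    return False


with tempfile.TemporaryDirectory() as build_dir:
    a_txt = os.path.join(build_dir, "a.txt")
    t1 = os.path.join(build_dir, "t1")

    # Pretend the recipe 'echo "123144234" > a.txt' has already run once.
    with open(a_txt, "w") as f:
        f.write("123144234\n")

    results = {
        "a.txt": needs_rebuild(a_txt, []),  # makefile1: the target file exists
        "t1": needs_rebuild(t1, []),        # makefile2: no file is ever named t1
    }

print(results)
```

Everything here is an ordinary local file check, which is all make needs. Under this simplified rule it also shows a side effect of makefile2: since no file named t1 is ever created, the t1 recipe is considered out of date on every run.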

edited Feb 18 '20 at 18:12

answered Feb 18 '20 at 17:59

It makes sense that only the compile command, rather than an entire `make` recipe, is run on other machines. Then, you don't need to care about my case at all. – Bob Feb 18 '20 at 18:14

@Bob The recipe also runs locally. It runs `distcc` locally, which, under the hood, communicates with `distcc` servers. – Maxim Egorushkin Feb 18 '20 at 18:16
