Disable cache for specific RUN commands

Question

I have a few RUN commands in my Dockerfile that I would like to run with --no-cache each time I build a Docker image.

I understand that docker build --no-cache disables caching for the entire Dockerfile.

Is it possible to disable cache for a specific RUN command?

Once you disable the cache for a single command, if the result doesn't match past cached run, you'd need to rebuild all remaining steps. Is that your goal, or do you hope to only rebuild a single layer and somehow inject that into where prior cached data was stored? — BMitch, Jun 13 '16 at 20:35
I was hoping to rebuild specific layers, for example a "git pull" command. Right now the "git pull" command will be cached, even though the repo is updated. — Vingtoft, Jun 14 '16 at 07:31
It's easy enough to force a pull by passing an unused argument. But the result of that cached entry being rebuilt is that all following layers will need a rebuild. See [my answer over here](http://stackoverflow.com/a/37798643/596285) for an example. — BMitch, Jun 14 '16 at 11:56
If looking to invalidate the cache when a git remote has changed take a look at: [How to prevent Dockerfile caching git clone](https://stackoverflow.com/a/39278224/2907791). All credit to [@anq](https://stackoverflow.com/users/243335/anq) for the linked answer. — hpgmiskin, Sep 09 '20 at 08:06

score 170 · Answer 1 · edited Jun 04 '20 at 00:39

One option is to insert a meaningless, cheap-to-run command just before the region for which you want to disable the cache.

As proposed in this issue comment, you can add a build argument (the name is arbitrary):

ARG CACHEBUST=1

before that region, and change its value on every run by passing --build-arg CACHEBUST=$(date +%s) to docker build (the value is also arbitrary; here it is the current Unix timestamp, which changes from one run to the next).

This will, of course, disable the cache for all following steps too, because the hash of the intermediate image will be different. That makes truly selective cache disabling a non-trivial problem, given how Docker currently works.
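
A minimal sketch of the whole pattern follows, written as a shell script that generates the Dockerfile. It includes the `RUN echo` step that the comments on this answer report is needed for the cache to actually be invalidated. The base image, the repository URL, the file name `Dockerfile.cachebust`, and the tag `myimage` are placeholders, not part of the original answer.

```shell
#!/bin/sh
# Sketch: write a Dockerfile whose cache is busted from the CACHEBUST step onward.
# Base image, repository URL, and file name are placeholders.
cat > Dockerfile.cachebust <<'EOF'
FROM alpine:3.19
# Steps above the ARG are cached as usual.
RUN apk add --no-cache git
ARG CACHEBUST
# This step references the ARG, so a new value causes a cache miss here
# and in every step after it.
RUN echo "cache bust: $CACHEBUST"
RUN git clone https://example.com/your_name/your_repository.git /src
EOF

# Produce a value that changes between runs (seconds since the epoch).
cachebust_value() {
  date +%s
}

# Build with a fresh CACHEBUST value; "myimage" is a hypothetical tag.
build_image() {
  docker build -f Dockerfile.cachebust \
    --build-arg CACHEBUST="$(cachebust_value)" \
    -t myimage .
}
```

Calling `build_image` twice in a row should reuse the cached `apk add` layer and re-run only the steps from `RUN echo` onward.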

answered Apr 11 '18 at 10:26 by Vladislav · edited Jun 04 '20 at 00:39 by Pang

Doesn't seem to work anymore, just got `---> Using cache` under my `ARG CACHEBUST=1` line... (and yes I did do `--build-arg CACHEBUST=$(date +%s)` in my docker command) – Pylinux Jul 22 '19 at 04:51
Does not work for me either, maybe it is platform dependent. I would have expected any ARG change to invalidate the cache. – Oliver Mar 31 '20 at 16:23
You have to add `RUN echo "$CACHEBUST"` as just using `ARG` will not invalidate the cache – Sidharth V Apr 30 '20 at 15:31
This answer solved my issue here: https://stackoverflow.com/questions/63709147/run-npm-update-in-docker-without-using-the-cache-on-that-specific-update/63737657#63737657 – shapiro yaacov Sep 04 '20 at 08:37
Confirmed working as of Podman `3.2.2` and Docker `20.10.6`. Side note: please do not assign a default value as in `ARG CACHEBUST=1`; just write `ARG CACHEBUST` instead. The difference is that in the latter case, should you forget to pass `--build-arg`, you'll get a warning `WARN[0000] missing "CACHEBUST" build argument. […]` At least Podman behaves this way. – Hi-Angel Aug 25 '21 at 08:58
It is also possible to define a label instead of the echo, i.e. `LABEL cachebust=$CACHEBUST`. Better choices could be of course `buildDate` or `buildNumber` or something else providing valuable meta data. – B. Baron Jun 15 '22 at 11:36
To summarize, the build command assigns a varying value to CACHEBUST (--build-arg CACHEBUST="{{ date_cmd_results.stdout }}"). In the Dockerfile, before the RUN git clone... command, you execute `ARG CACHEBUST=1` followed by `RUN echo "$CACHEBUST"`? Did I understand correctly? – majorgear May 22 '23 at 17:24

score 73 · Answer 2 · edited Nov 18 '19 at 21:28

Use

ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache

before the RUN line you want to always run. This works because ADD always fetches the URL, and the URL above returns random data on each request; Docker then compares the result with the previous one to decide whether it can use the cache.

I have also tested this and it works nicely: it does not require any additional Docker command-line arguments, and it also works from a docker-compose.yaml file :)

answered Nov 11 '19 at 12:19 by steve · edited Nov 18 '19 at 21:28 by Slava Fomin II

whats going to happen if random.org decides to change that endpoint? how would you control that behaviour? – Andre Leon Rangel Jun 12 '20 at 02:42
@AndresLeonRangel Admittedly this is not a Docker feature but a hack that combines Docker syntax with a well-known web service that has been around for 20+ years. However, you are right that they may deprecate that endpoint; in fact, looking at their docs now, I can't even find the "randbyte" endpoint, and they have a new API currently in beta. You can either 1) continue to use this endpoint until it fails, 2) use their new endpoint (until it fails) or 3) write your own random endpoint, in which case you are in full control :) – steve Jun 15 '20 at 09:29
This failed some times... when site is down!!! I think it's not perfect solution for this. ADD failed: failed to GET https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h with status 503 Service Unavailable: – Kathi Jun 22 '20 at 05:07
random.org has added DDOS protection which breaks this solution now – Brad Root Jun 22 '20 at 20:58
It doesn't work, and the given address returns 503. If you don't want to block your pipelines, do not use this solution – OlegI Jun 26 '20 at 09:58
@OlegI The solution does work, as Brad has mentioned random.org have added DDOS protection so using that URL for this solution is probably now not viable, however writing your own simple endpoint that returns a series of random bytes is trivial and could be hosted on your build server guaranteeing that it works – steve Jun 27 '20 at 12:04
This is so smart. In my case I'm installing a bunch of software from a rolling-release repository. By downloading the db file with ADD first, the install step re-runs automatically as soon as there are newer packages to install, and the steps are cached automatically when the versions are the same. Awesome! :) – tkarls Mar 03 '21 at 13:02
Too funny, everyone is using this as solution and it becomes DDOS. – BoBoDev Mar 18 '21 at 21:30
Not sure why no one has highlighted this yet, but this does add a layer with a useless `skipcache` file. Using `ARG CACHEBUST` + `RUN echo $CACHEBUST` does not – Filipe Pina Sep 04 '21 at 00:00
Does it work before a COPY command ? – vhamon Sep 20 '21 at 09:39
why don't people use a local source of randomness like `/dev/urandom` ??? – jaksco Oct 25 '22 at 17:00
@jaksco I thought same initially, but it fails with "failed to compute cache key /dev/urandom not found" - urandom is an endless stream which on its own would ADD forever - I would not have expected the "not found" error though – steve Oct 27 '22 at 01:30
We finally killed it. Alternative: http://www.randomnumberapi.com/api/v1.0/random?min=100&max=1000&count=5 – Antony Woods Oct 27 '22 at 16:09
@steve you need to mount it first `-v /dev/urandom:/dev/random` – jaksco Nov 04 '22 at 07:10
Adding such an external dependency you do not control is bad. If you deem it acceptable, you deserve anything that happens to you due to that. Its unavailability causing builds to break? Unexpected large responses bloating the image? Responses containing malicious code that becomes a part of an exploit, or at least triggering alerts in image security scanners? Everything. – Palec Nov 05 '22 at 20:38

score 24 · Answer 3 · answered Jan 17 '21 at 14:56

If your goal is to include the latest code from GitHub (or similar), you can use the GitHub API (or equivalent) to fetch information about the latest commit using an ADD command.
docker build will always fetch a URL from an ADD command, and if the response differs from the one received the last time docker build ran, it will not use the subsequent cached layers.

For example:

ADD "https://api.github.com/repos/username/repo_name/commits?per_page=1" latest_commit
RUN curl -sLO "https://github.com/username/repo_name/archive/main.zip" && unzip main.zip

answered Jan 17 '21 at 14:56 by Guillaume Boudreau

This is great. When the code changed, my git clone and tests run; when the code has not changed, cached values are used. Perfection. – n13 Aug 10 '22 at 04:37
This should be the accepted solution, very clean and simple, just include the `ADD` step before whatever you currently do to install the package. Please upvote this answer to raise attention to it. – David Parks Apr 24 '23 at 16:10

score 17 · Answer 4 · answered Jan 21 '22 at 05:06

Building on @Vladislav’s solution above I used in my Dockerfile

ARG CACHEBUST=0

to invalidate the build cache from here on.

However, instead of passing a date or some other random value, I call

docker build --build-arg CACHEBUST=`git rev-parse ${GITHUB_REF}` ...

where GITHUB_REF is a branch name (e.g. main) whose latest commit hash is used. That means that docker’s build cache is being invalidated only if the branch from which I build the image has had commits since the last run of docker build.
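
The command above can be wrapped in a small helper. This is a sketch, not part of the original answer: the function name `cachebust_from_ref`, the default of `HEAD`, and the timestamp fallback are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch: derive CACHEBUST from the commit a ref points to, falling back to a
# timestamp when the ref cannot be resolved (e.g. outside a git checkout).
# The ref comes from the first argument, then $GITHUB_REF, then HEAD (assumed default).
cachebust_from_ref() {
  ref="${1:-${GITHUB_REF:-HEAD}}"
  # --verify --quiet prints the commit hash on success and nothing on failure.
  git rev-parse --verify --quiet "$ref" 2>/dev/null || date +%s
}
```

Run from inside the checkout, it would be used as `docker build --build-arg CACHEBUST="$(cachebust_from_ref main)" .`; with the fallback, a build outside a repository always gets a fresh value instead of an empty one.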

score 11 · Answer 5 · answered Feb 02 '16 at 13:25

As of February 2016 it is not possible.

The feature has been requested at GitHub

answered Feb 02 '16 at 13:25 by Vingtoft

score 9 · Answer 6 · answered Feb 01 '16 at 16:26

Not directly, but you can split your Dockerfile into several parts: build an image from the first part, start the next Dockerfile with FROM thisimage, and build each part with or without caching as needed.
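
A rough sketch of that split, as a shell script that writes the two files. The file names, the base image, the repository URL, and the tag `myapp` are placeholders chosen for illustration; only `thisimage` comes from the answer.

```shell
#!/bin/sh
# Sketch: a cached first part and an always-rebuilt second part.
cat > Dockerfile.base <<'EOF'
FROM alpine:3.19
RUN apk add --no-cache git
EOF

cat > Dockerfile.app <<'EOF'
FROM thisimage
RUN git clone https://example.com/your_name/your_repository.git /src
EOF

build_split() {
  # First part: normal layer caching applies.
  docker build -f Dockerfile.base -t thisimage . || return 1
  # Second part: --no-cache forces every step in this file to run again.
  docker build --no-cache -f Dockerfile.app -t myapp .
}
```

Calling `build_split` rebuilds only the steps in `Dockerfile.app` on every run, at the cost of maintaining two files and two build commands.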

answered Feb 01 '16 at 16:26 by user2915097

Will this enable updating the commited layers in the base docker image? – user_mda Jan 09 '17 at 17:24

score 8 · Answer 7 · answered Nov 11 '19 at 11:16

The feature was added a week ago (as of November 2019):

ARG FOO=bar

FROM something
RUN echo "this won't be affected if the value of FOO changes"
ARG FOO
RUN echo "this step will be executed again if the value of FOO changes"

FROM something-else
RUN echo "this won't be affected because this stage doesn't use the FOO build-arg"

https://github.com/moby/moby/issues/1996#issuecomment-550020843

score 5 · Answer 8 · answered Sep 29 '20 at 12:47

I believe that this is a slight improvement on @steve's answer, above:

RUN git clone "https://<user>:<token>@gitlab.com/your_name/your_repository.git"

WORKDIR your_repository

# Calls for a random number to break the caching of the git clone
# (https://stackoverflow.com/questions/35134713/disable-cache-for-specific-run-commands/58801213#58801213)
ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache
RUN git pull

This uses the Docker cache of the git clone, but then runs an uncached update of the repository.

It appears to work, and it is faster - but many thanks to @steve for providing the underlying principles.

Nice. Keep in mind this breaks when there are rewrites – sehe Feb 10 '21 at 00:20

score -5 · Answer 9 · answered Apr 19 '19 at 02:26

Another quick hack is to write some random bytes before your command

RUN head -c 5 /dev/random > random_bytes && <run your command>

writes out 5 random bytes which will force a cache miss

answered Apr 19 '19 at 02:26 by Mark

The result of writing those random bytes gets cached as well, so if no files have changed before that command, it won't run the command again. This doesn't solve anything. – Icy Defiance Jun 24 '19 at 13:31
