Why does COPY package*.json ./ precede COPY . .?

Question

In this Node.js tutorial on Docker: https://nodejs.org/en/docs/guides/nodejs-docker-webapp/

What is the point of COPY package*.json ./?

Isn't everything copied over with COPY . .?

The Dockerfile in question:

FROM node:8

# Create app directory
WORKDIR /usr/src/app

# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./

RUN npm install
# If you are building your code for production
# RUN npm install --only=production

# Bundle app source
COPY . .

EXPOSE 8080
CMD [ "npm", "start" ]

I'm pretty sure that removing `COPY package*.json ./` still works if you put `RUN npm install` after `COPY . .`. I have the same question as you; I think it's unnecessary... – Truong Dang, Jul 26 '18 at 08:53
Does `COPY package*.json` also include the package-lock.json file? If not, what is the star for? – Eldar Omerovic, Nov 17 '21 at 12:16
@EldarOmerovic, yes. `COPY package*.json` copies package.json and, if it exists, package-lock.json, because package*.json matches any filename that begins with package and ends with .json, with any characters in between. – Jayani Sumudini, Jun 25 '23 at 08:42

score 68 · Accepted Answer · answered Jul 26 '18 at 10:09

This is a common pattern in Dockerfiles (in all languages). The npm install step takes a long time, but you only need to run it when the package dependencies change. So it's typical to see one step that just installs dependencies, and a second step that adds the actual application, because it makes rebuilding the image go faster.

You're right that this is essentially identical if you're building the image once; you get the same filesystem contents out at the end.

Say this happens while you're working on the package, though. You've changed some src/*.js file, but haven't changed the package.json. You run npm test and it looks good. Now you re-run docker build. Docker notices that the package*.json files haven't changed, so it uses the same image layer it built the first time without re-running anything, and it also skips the npm install step (because it assumes running the same command on the same input filesystem produces the same output filesystem). So this makes the second build run faster.
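To make the caching concrete, here is the question's Dockerfile again with comments added to mark what happens on a rebuild after only a src/*.js file has changed. The instructions are unchanged from the question; the comments are an illustrative annotation of the behaviour described above, not output from Docker.

```dockerfile
FROM node:8
WORKDIR /usr/src/app

# Cache check: checksums of package.json / package-lock.json.
# Editing src/*.js does not touch them, so this step is a cache hit.
COPY package*.json ./

# Cache check: the command string, on top of the same parent layer.
# The parent layer was a cache hit, so npm install is skipped as well.
RUN npm install

# Cache check: checksums of the files copied from the build context.
# The edited source file causes a miss here, so this step and the
# ones after it are executed again.
COPY . .

EXPOSE 8080
CMD [ "npm", "start" ]
```

Only the cheap steps from `COPY . .` onward are repeated, which is why the second build is faster.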

David Maze

2

I thought it was a kind of shortcut: if `npm install` fails, we avoid going on to copy the application source. – Jakub Barczyk Jul 26 '18 at 19:58
2

But isn't the `node_modules` directory created in that first layer fully covered by the `node_modules` directory in the next layer that is copied over with the `COPY . .` command? Also, there could theoretically be newer module versions in the host dir with unchanged `package.json`. – Passiday Apr 25 '20 at 09:24
11

You also should include `node_modules` in a `.dockerignore` file so that the `COPY` step doesn't overwrite what just got `npm install`ed. – David Maze Apr 25 '20 at 11:15
this is just a lame docker design, please vote my comment and get noticed by docker @docker – Kevin Simple Mar 02 '22 at 03:01
In terms of consistency, I think this is good because Docker uses the cached node_modules. However, I think it won't pick up newer minor versions allowed by package.json: since `npm install` isn't run again, the newer minor version is not downloaded. – v.ng Sep 28 '22 at 04:06

Deepak Kashyap · Answer 2 · 2022-12-27T15:50:48.227

When building an image, Docker works on a layer-based architecture: each instruction you write in a Dockerfile becomes a layer, and that layer gets cached. Copying the package*.json files first is an optimization in the Dockerfile, because we want to run npm install only when the project's dependencies change. With package*.json copied into the image filesystem first, each successive build runs npm install only when a dependency has been added or changed, and then just copies everything else into the image filesystem. Docker is a headless piece of software: it doesn't re-check the layers that come after a changed layer, it simply executes every instruction from that point on. Hence this ordering saves us from running npm install every time the rest of the host files are copied into the image filesystem.

score 0 · Answer 3 · answered Jun 25 '23 at 08:46

The reason we first copy package.json & package-lock.json and install our dependencies before copying the rest of the application is speed and optimization.

Docker images are built up from layers, and each instruction in a Dockerfile represents a layer. When you build an image, Docker tries to speed up the build by rebuilding only the layer that has changed, along with the layers on top of it (those that come later in the Dockerfile).

If we copy the entire code base before installing our dependencies, every change we make during development forces Docker to reinstall all our dependencies, even though most of the time they have not changed. With package.json and package-lock.json copied first, Docker runs npm install only if one of them has changed. If not, it only copies the latest changes in your code base.

Building an image can take some time, so this is a healthy optimization that we want to use.
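For comparison, here is a sketch of the ordering this pattern avoids. It reuses the base image and paths from the question's Dockerfile and is meant only to illustrate the slower variant; it is not taken from the tutorial.

```dockerfile
FROM node:8
WORKDIR /usr/src/app

# Any changed file in the build context invalidates this layer...
COPY . .

# ...so this step runs again after every source edit,
# even when package.json and package-lock.json are unchanged.
RUN npm install

EXPOSE 8080
CMD [ "npm", "start" ]
```

The resulting image is the same; the difference is that every rebuild pays for a full npm install.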
