211

I'm wondering if we should be tracking node_modules in our repo or doing an npm install when checking out the code?

Tolga E
  • 5
    Related: [Should I check in node_modules to git when creating a node.js app on Heroku?](http://stackoverflow.com/questions/11459475/should-i-check-in-node-modules-to-git-when-creating-a-node-js-app-on-heroku) – Benjamin Crouzier Feb 10 '14 at 11:04

9 Answers

204

The answer is not as easy as Alberto Zaccagni suggests. If you develop applications (especially enterprise applications), including node_modules in your git repo is a viable choice, and which alternative you pick depends on your project.

Because he argued very well against node_modules, I will concentrate on the arguments for them.

Imagine that you have just finished an enterprise app and will have to support it for 3-5 years. You definitely don't want to depend on someone's npm module that could disappear tomorrow, leaving you unable to update your app anymore.

Or you have private modules which are not accessible from the internet, and you can't build your app with internet access. Or maybe you don't want your final build to depend on the npm service for some reason.

You can find pros and cons in this Addy Osmani article (although it is about Bower, the situation is almost the same). And I will end with a quote from the Bower homepage and Addy's article:

“If you aren’t authoring a package that is intended to be consumed by others (e.g., you’re building a web app), you should always check installed packages into source control.”

ivoszz
  • I think you're right. This is an enterprise app and I don't want to depend on what happens to an open source project in the future – Tolga E Aug 09 '13 at 15:56
  • 7
    I agree with this entirely. I don't want our enterprise build system to *require* an Internet connection to make a successful build because it needs to download dependencies, which *hopefully* are still around. Thanks. – deadlydog Nov 04 '13 at 16:42
  • 1
    Another good article illustrating why it's a good idea to track `node_modules` in the repo if you're deploying an app and not maintaining a package: http://www.futurealoof.com/posts/nodemodules-in-git.html – Will Nov 06 '13 at 04:14
  • 4
    I've changed my mind a bit on this matter, I believe both views have advantages, but what I said is entirely "philosophical", this approach has a more direct impact if, for example, network is not accessible or whatever similar issue. – Alberto Zaccagni Nov 07 '13 at 19:20
  • 14
    @Alberto Zaccagni I believe you were right the first time. If you're really building an enterprise app, then you should be using enterprise tools. Artifactory and npm-artifactory should be used to protect against projects disappearing from the internet. Even on small projects this is cleaner than having several copies of the same thing checked into source control. – Ted Bigham Sep 13 '14 at 22:11
  • 11
    After the [left-pad issue](http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm), I think it's definitely not a bad idea to track node_modules. – Léo Lam Jun 04 '16 at 21:22
  • 6
    Important aspect noone mentioned. If your node_modules are under VCS – switching branches is just `git checkout foo`. If node_modules are not under VCS – switching branches is `git checkout foo ; npm install` and whatever your current NPM version requires to work ;) – Ivan Kleshnin Jul 09 '16 at 09:05
  • @IvanKleshnin I don't think that's true. Let's say you are checking out a new feature. Why would you want to update the dependencies from the main project? You will do that when merging to the main branch. – Jorjon May 05 '17 at 08:37
  • @Veehmot you mean it does not always require npm install? Yes, and that's even worse because you can't automate that. At the moment I'm pretty convinced that it's a Git flaw to have a single working directory and not N working dirs (one per branch). It's possible to implement (some people even did prototypes). I used to have some proof links... somewhere. Whatever. – Ivan Kleshnin May 08 '17 at 09:19
  • 1
    I am working on an application where we only work on it every 2-3 months. The npm modules had often changed/broken on return (even webpack!). Can't have that level of unreliability, so checking in the modules. Once modules are there, they can't/shouldn't be removed (except for security issues). – JsAndDotNet Oct 04 '17 at 13:28
  • CHECK in node_modules! The way npm builds package.json by default describes the dependencies as "this version or greater". I have two identical machines, with the same application, both with identical package.json files. I did an npm install on ONE and now the two applications have 4,453 files that differ! And the application no longer runs! – Addinall Dec 11 '17 at 01:42
  • 8
    The cleanest enterprise solution would be to host an internal intranet-accessible npm repository that has all of the versions of modules that you use, and don't check in node_modules with the source code. Your build system would reference your internal node repository. – user2867288 May 21 '18 at 15:39
  • This answer is not a yay or nay, comes with platform compiling issues, and the longevity is easily mitigated by internal registries and Artifactory. Not to paint the shed, but imho, look at the other answers before doing this. – Vargr Oct 02 '18 at 12:07
  • With this approach, would you also suggest that any native app include all the shared libraries it depends on? – Benav Dec 31 '18 at 05:42
  • 1
    According to this document (https://docs.npmjs.com/files/package-lock.json) you don't (and shouldn't) have to check-in the node_modules directory so long as you make sure to check in the package-lock.json file (and I would say the yarn.lock file as well if you use that) every time it changes. – aimass Jan 06 '19 at 14:39
  • 1
    The argument that you shouldn't need an internet connection to build your software is tending towards archaism. Imagine if Dominos had adhered to an arbitrary edict that you shouldn't need a motor car to sell your pizza. – Colm Feb 28 '19 at 16:24
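
For the internal-registry approach suggested in the comments above, pointing npm at a mirror is a one-line .npmrc entry at the project root (the URL here is a placeholder for whatever your intranet registry is):

    registry=https://npm.internal.example.com/
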
118

Module details are stored in package.json; that is enough. There's no need to check in node_modules.

People used to store node_modules in version control to lock the dependencies of modules, but with npm shrinkwrap that's not needed anymore.
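
For illustration, running `npm shrinkwrap` in a project writes an npm-shrinkwrap.json that pins exact, resolved versions. A minimal sketch of its shape (package name and versions are just examples, and the exact fields vary by npm version):

    {
      "name": "my-app",
      "version": "1.0.0",
      "dependencies": {
        "left-pad": {
          "version": "1.3.0",
          "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz"
        }
      }
    }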

Another justification for this point, as @ChrisCM wrote in the comments:

Also worth noting, any modules that involve native extensions will not work architecture to architecture, and need to be rebuilt. Providing concrete justification for NOT including them in the repo.

Alberto Zaccagni
  • 11
    Simple, and to the point +1. Also worth noting, any modules that involve native extensions will not work architecture to architecture, and need to be rebuilt. Providing concrete justification for NOT including them in the repo. – MobA11y Aug 08 '13 at 14:58
  • 3
    Not really, this is justification for using a reproducible dev environment using e.g. vagrant. It should only need to work on one architecture. – Robin Smith Sep 18 '15 at 12:50
25

I would recommend against checking in node_modules because of packages like PhantomJS and node-sass, for example, which install the appropriate binary for the current system.

This means that if one dev runs npm install on Linux and checks in node_modules, it won't work for another dev who clones the repo on Windows.

It's better to check in the tarballs which npm install downloads and point npm-shrinkwrap.json at them. You can automate this process using shrinkpack.

Jamie Mason
  • But doesn't `npm install --global shrinkpack` itself then have the deferred weakness, by requiring other packages with which to then install the shrunken packages? This goes against Addy's Advice. – danjah Nov 02 '16 at 10:55
  • could you rephrase the question please @danjah? I don't fully understand you sorry. – Jamie Mason Nov 02 '16 at 10:58
  • From what you describe, dependence on `shrinkpack` is required to then reliably install build dependencies. Therefore, the installation of the build tool itself becomes the weakness to the argument against submitting all build dependencies to version control. – danjah Nov 02 '16 at 11:02
  • `shrinkpack` is not required at install-time, so installation of a shrinkpacked project is handled entirely by `npm`. No internet connection is even needed, as everything has been provided ahead of time by `shrinkpack`. The environment which is doing the `npm install` doesn't need to know `shrinkpack` exists; it all just works on its own. – Jamie Mason Nov 02 '16 at 11:22
  • 1. `npm install --ignore-scripts` 2. Check in 3. `npm install` 4. add generated files, if any, to ignored list 5. `npm install` as a part of a build process – Alex Che Apr 14 '18 at 12:46
  • 1
    I think that checking in the lock files is enough (package-lock.json; yarn.lock) at least according to TFM: https://docs.npmjs.com/files/package-lock.json – aimass Jan 06 '19 at 14:42
  • 1
    you would get a predictable dependency graph when using a lockfile, and not be susceptible to the issues discussed around PhantomJS and node-sass etc on different platforms. You would need an internet connection and for the registry to be up though of course. – Jamie Mason Jan 06 '19 at 15:15
9

This topic is pretty old, I see. But the arguments provided here deserve an update, given the changed situation in npm's ecosystem.

I'd always advise against putting node_modules under version control. Nearly all the benefits of doing so listed in the accepted answer are pretty outdated as of now.

  1. Published packages can't be revoked from the npm registry that easily anymore. So you don't have to fear losing dependencies your project has relied on before.

  2. Putting the package-lock.json file in VCS helps with frequently updated dependencies that would otherwise result in different setups despite relying on the same package.json file.

So, putting node_modules into VCS might be considered eligible only when you need offline build tools. However, node_modules usually grows pretty fast, and any update will change a lot of files. This affects repositories in different ways, and if you really consider the long-term effects, it can be an impediment as well.

Centralized VCSs like svn require transferring committed and checked-out files over the network, which is going to be slow as hell when it comes to checking out or updating a node_modules folder.

When it comes to git, this high number of additional files will instantly pollute the repository. Keep in mind that git doesn't track differences between versions of a file, but stores a copy of each version as soon as a single character has changed. Every update to any dependency will result in another large changeset, so your git repository will quickly grow huge, affecting backups and remote synchronization. If you decide to remove node_modules from the git repository later, it is still part of the history. And if you have distributed your git repository to some remote server (e.g. for backup), cleaning it up is another painful and error-prone task you'd be running into.

Thus, if you care about efficient processes and like to keep things "small", I'd rather use a separate artifact repository such as Nexus Repository (or just some HTTP server with ZIP archives) providing a previously fetched set of dependencies for download.

Thomas Urban
6

Not tracking node_modules in source control is the right choice because some Node.js modules, like the MongoDB Node.js driver, use Node.js C++ add-ons. These add-ons are compiled when running the npm install command, so when you track the node_modules directory, you may accidentally commit an OS-specific binary file.

M.Z.
4

I agree with ivoszz that it's sometimes useful to check in the node_modules folder, but...


scenario 1:

One scenario: you use a package that gets removed from npm. If you have all the modules in the node_modules folder, it won't be a problem for you. If you only have the package name in your package.json, you can't get it anymore. If a package is less than 24 hours old, its author can easily remove it from npm; if it's older than 24 hours, they need to contact npm support. But:

If you contact support, they will check to see if removing that version of your package would break any other installs. If so, we will not remove it.

read more

So the chances of this are low, but there is scenario 2...


scenario 2:

Another scenario where this is the case: you develop an enterprise version of your software, or a very important piece of software, and write in your package.json:

"dependencies": {
    "studpid-package": "~1.0.1"
}

You use the method function1(x) of that package.

Now the developers of studpid-package rename the method function1(x) to function2(x) and make a mistake... They change the version of their package from 1.0.1 to 1.1.0. That's a problem, because the next time you call npm install you will accept version 1.1.0, since you used the tilde ("studpid-package": "~1.0.1").

Calling function1(x) can cause errors and problems now.
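
One mitigation for this scenario, without committing node_modules, is to pin the exact version (no tilde or caret), keeping the hypothetical package name from above:

"dependencies": {
    "studpid-package": "1.0.1"
}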


But:

Pushing the whole node_modules folder (often more than 100 MB) to your repository will cost you storage space. A few kB (package.json only) compared with hundreds of MB (package.json & node_modules)... Think about it.

You could do it / should think about it if:

  • the software is very important.

  • it costs you money when something fails.

  • you don't trust the npm registry. npm is centralized and could theoretically be shut down.

You don't need to publish the node_modules folder in 99.9% of cases if:

  • you develop a software just for yourself.

  • you've programmed something and just want to publish the result on GitHub because someone else might be interested in it.


If you don't want node_modules in your repository, just create a .gitignore file and add the line node_modules.
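
A minimal .gitignore for that:

    node_modules/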

hardfork
  • 1
    One more disadvantage of "publishing the node_modules folder" could be: calling `npm install` on Windows and macOS could generate different files (OS-dependent files) in some packages. But I'm not sure about that. Can someone verify that this is true? – hardfork Jan 01 '19 at 14:15
  • 2
    "scenario 2": that's why you commit `package-lock.json`. If there's a problem in future with an update of studpid-package, you can roll back the lock file to find out the exact version that did work for you. – ToolmakerSteve Oct 30 '19 at 20:05
2

I would like to offer a middle of the road alternative.

  1. Don't add node_modules into git.
  2. Use a package-lock.json file to nail down your dependency versions.
  3. In your CI or release process, when you release a version, make a copy of the node_modules folder and back it up (e.g. in cloud storage).

In the rare event that you cannot access NPM (or other registries you use) or a specific package in NPM, you have a copy of node_modules and can carry on working until you restore access.
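
Step 3 above might look like this in a release script (the upload destination is hypothetical; this sketch creates a placeholder node_modules so it can run standalone):

```shell
# Archive the exact node_modules used for this release, keyed by commit.
# In a real CI job node_modules already exists after `npm ci`;
# here we create a placeholder so the sketch runs standalone.
mkdir -p node_modules
COMMIT=$(git rev-parse --short HEAD 2>/dev/null || echo "release")
tar -czf "node_modules-$COMMIT.tgz" node_modules
# Then copy the archive to cloud storage, e.g.:
# aws s3 cp "node_modules-$COMMIT.tgz" s3://my-backups/releases/
```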

Martin Capodici
  • This answer states the **best practices**. Although the `package-lock.json` only came in the later versions. Maybe the early implementers of NodeJS didn't have this methodology back then. – Abel Callejo Mar 12 '21 at 03:47
0

One more thing to consider: checking in node_modules makes it harder or impossible to make use of the distinction between dependencies and devDependencies.

On the other hand though, one could say it's reassuring to push to production the exact same code that went through tests - so including devDependencies.

Jan Żankowski
  • "to production the exact same code that went through tests": That's what you have Docker for. Or an os package manager, like rpm. You don't rebuild the code between test and prod, do you? devDependencies helped build the final code, but has no place in a deployment, neither in test nor prod. – Per Wiklander Oct 18 '17 at 11:08
  • Would it help if the devDependencies were in their own package.json one directory higher than the "src" directory? Since node modules are searched for starting in the current directory and then moving up, you should still be able to use your dev dependencies and have separation of dev/src modules. – Alex Nov 10 '17 at 02:52
0

node_modules is not required to be checked in if the dependencies are listed in package.json. Any other programmer can simply get them by running npm install, and npm is smart enough to create node_modules in your working directory for the project.

Himanshu