22

We all know that downloading dependencies with npm can be very time consuming, especially when we are limited to old npm versions.

For me, as a developer, this wasn't a big deal: I only had to do it a few times on my local development machine, and everything worked thanks to the node_modules cache in my project's folder. But now I want to take these applications to a CI environment, with Jenkins.

I realized a huge amount of time was spent downloading dependencies with npm. This is a problem because:

  1. npm downloads dependencies into the project's folder, not into a global folder such as Maven's /home/user/.m2

  2. I have to clean up the Jenkins workspace folder on every run to avoid issues with the git checkout.

I want a very elegant solution for caching the npm dependencies on my Jenkins slaves, but so far I can only think of:

  1. Removing everything but the node_modules folders from the Jenkins workspace. I don't like this because it could consume a lot of disk space if I keep creating branches for my project, since each branch creates a workspace.

  2. Doing something like cp -r ./node_modules /home/npm_cache after every npm install, and then cp -r /home/npm_cache ./node_modules after the code checkout.
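Option 2 can be sketched as a pair of small helper functions (a sketch only; the cache path is the hypothetical /home/npm_cache from above, and cp needs -r for directories):

```shell
#!/bin/sh
# Sketch of option 2. CACHE_DIR is an assumed location; override it per agent.
CACHE_DIR="${CACHE_DIR:-/home/npm_cache}"

# After the code checkout: copy the cached node_modules into the workspace.
restore_cache() {
  if [ -d "$CACHE_DIR/node_modules" ]; then
    cp -r "$CACHE_DIR/node_modules" .
  fi
}

# After npm install: refresh the cache from the workspace.
save_cache() {
  mkdir -p "$CACHE_DIR"
  rm -rf "$CACHE_DIR/node_modules"
  cp -r node_modules "$CACHE_DIR/node_modules"
}
```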

I feel these solutions are terrible. There must be a better way to do this.

AFP_555
  • 2,392
  • 4
  • 25
  • 45
  • I am not sure how long it is taking to download the NPM dependencies in your case but we minimized the time by adding Nexus as proxy manager in between rather than downloading the dependencies from Internet every time. Certainly the options you mentioned are also followed in general too. – Seshagiri Oct 22 '17 at 12:21
  • did you find a better solution? npm downloading takes over half the time of our build :( – santiagozky Jan 11 '18 at 17:56

6 Answers

5

NPM has a global cache stored in ~/.npm
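That cache survives workspace cleanups, so a pipeline can lean on it instead of re-downloading everything. A minimal Jenkinsfile sketch (assuming npm >= 5 on the agent; `npm config get cache` reports the cache location, and `--prefer-offline` tells npm to use cached tarballs when possible):

```groovy
// Jenkinsfile sketch: reuse the per-user npm cache (~/.npm) across builds.
stage('Install') {
    steps {
        sh 'npm config get cache'        // prints the cache location
        sh 'npm ci --prefer-offline'     // reuse cached tarballs when possible
    }
}
```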

Steven
  • 51
  • 1
  • 2
  • 3
  • Some dependencies have postinstall steps which download binaries and may not use caching as you'd like (such as puppeteer, which downloads chrome every time). – justin.m.chase Dec 22 '20 at 17:08
5

What I have done in my Jenkins pipeline for 3 different projects is to use tar instead of cp, and npm install instead of npm ci. For each project, once:

  1. cd to your project
  2. npm i
  3. tar cvfz ${HOME}/your_project_node_modules.tar.gz node_modules

Then in the pipeline:

dir(your_project){
  sh "tar xf ${HOME}/your_project_node_modules.tar.gz"
  sh "npm i"
}

Of course this has the disadvantage that dependencies change over time and the install will take longer, but I've managed to reduce disk space usage in the image by about 0.5 GB, and tar is much faster than cp (cp ~30 sec, tar ~5 sec).

Total install time went in my case from about 3 minutes to a matter of seconds.
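The manual steps can be wrapped in two small functions (a sketch; the archive path matches the assumed ${HOME} location above):

```shell
#!/bin/sh
# Sketch of the tar-based cache. ARCHIVE is the assumed location from above.
ARCHIVE="${ARCHIVE:-$HOME/your_project_node_modules.tar.gz}"

# Extract the cached node_modules before npm install (no-op on the first run).
restore_modules() {
  if [ -f "$ARCHIVE" ]; then
    tar xzf "$ARCHIVE"
  fi
}

# Re-create the archive after npm install.
save_modules() {
  tar czf "$ARCHIVE" node_modules
}
```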

Moshisho
  • 2,781
  • 1
  • 23
  • 39
  • Thanks! But I have the following problem: it takes too long to extract the archive content for some reason. I saw an answer saying I need to use the `--occurrence=1` flag, but the OS is Windows. Maybe you know how to fix this problem? – excelsiorious Sep 10 '20 at 22:44
  • AFAIK, Windows doesn't have `tar`, do you have Cygwin or something? this might be the issue. – Moshisho Sep 13 '20 at 09:24
  • I use `sh.exe`, provided by git. I tried to find an answer in other sources and now I think that this expected behavior. Extraction of the folder from `.tar.gz` (size of data after extraction is ~600 MB) lasts 10-15 mins. But correct me please if I'm wrong – excelsiorious Sep 21 '20 at 23:42
  • For Windows I would prefer 7zip, see [here](https://stackoverflow.com/a/18180154/2470092) – Moshisho Sep 22 '20 at 06:47
  • For my case extraction speed was not improved – excelsiorious Sep 22 '20 at 10:17
4

You can just use pnpm.io, which will make your build significantly faster (also locally). It uses the same API as npm.
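A sketch of using pnpm from a pipeline (assumes Node >= 16.10, which bundles Corepack; `--frozen-lockfile` is pnpm's rough equivalent of npm ci):

```groovy
stage('Install') {
    steps {
        sh 'corepack enable'                 // makes the pnpm shim available
        sh 'pnpm install --frozen-lockfile'  // links from pnpm's global content-addressable store
    }
}
```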

Alternatively:

The parts of the Jenkinsfile below do the following:

On the master and develop branches, a fresh npm install is always executed.

On all other branches, package.json is MD5-hashed, and after npm install the node_modules folder is placed in the cache folder as <CACHE_DIRECTORY>/<MD5_SUM_PACKAGE_JSON>/node_modules.

The next build can then reuse that node_modules folder and doesn't have to download all the dependencies again.

parameters {
    booleanParam(name: "CACHED_NODE_MODULES",
            description: "Should node_modules be taken from cache?",
            defaultValue: !'master'.equals(env.BRANCH_NAME) && !'develop'.equals(env.BRANCH_NAME))
}

...

stage('Build') {
   steps {
      cacheOrRestoreNodeModules()
      echo "Performing npm build..."
      sh 'npm install'
   }
}

...

def cacheOrRestoreNodeModules() {
    if (params.CACHED_NODE_MODULES) {
        sh '''
        # take only the hash column of md5sum (portable: avoids bash arrays,
        # which fail when the sh step runs under dash)
        MD5_SUM_PACKAGE_JSON=$(md5sum package.json | cut -d ' ' -f 1)
        CACHE_FOLDER=/home/jenkins/.cache/npm/${MD5_SUM_PACKAGE_JSON}

        # if the cache folder exists, copy node_modules into the workspace
        if [ -d ${CACHE_FOLDER} ]; then
          cp -r ${CACHE_FOLDER}/node_modules .
        fi

        npm install --no-audit

        # if the cache folder does not exist, create it and cache node_modules
        if ! [ -d ${CACHE_FOLDER} ]; then
          mkdir -p ${CACHE_FOLDER}
          cp -r node_modules ${CACHE_FOLDER}/node_modules
        fi
        '''
    }
}
seenukarthi
  • 8,241
  • 10
  • 47
  • 68
schowave
  • 306
  • 2
  • 12
  • 1
    Or you can just use https://pnpm.io/ which will make your build significantly faster (also locally). It uses the same API as npm... – schowave Sep 07 '21 at 00:26
  • thanks for the answer and comment - btw your comment was similarly useful (and maybe easier), so let's embed it into the answer. – T.Todua Mar 04 '22 at 20:17
  • [ here is also another variation of the script: https://dev.to/khsing/speed-up-jenkins-with-npm-build-3pc ] – T.Todua Mar 04 '22 at 20:20
  • never use npm install on a ci environment - unintended consequences – evanjmg Apr 08 '22 at 16:07
1

I created the following script to check the md5sum of package.json in Jenkins:

stage('NPM Build') {
  steps {
    sh '''
    node -v && npm -v
    '''
    // rm -rf node_modules
    sh '''
    CACHE_FOLDER=${HOME}/.cache/md5
    echo "EXECUTOR_NUMBER: ${EXECUTOR_NUMBER}"
    MD5_FILE_NAME=package-json_${EXECUTOR_NUMBER}.md5sum

    [ -d ${CACHE_FOLDER} ] || mkdir -p ${CACHE_FOLDER}
    ls ${CACHE_FOLDER}

    if [ -f ${CACHE_FOLDER}/${MD5_FILE_NAME} ]; then
      cp ${CACHE_FOLDER}/${MD5_FILE_NAME} ${MD5_FILE_NAME}
      md5sum package.json
      cat ${MD5_FILE_NAME}
      md5sum -c ${MD5_FILE_NAME} || npm ci
    else
      echo "No md5sum backup"
      npm ci
    fi

    echo "create new md5sum backup"
    md5sum package.json
    md5sum package.json > ${MD5_FILE_NAME}
    cp ${MD5_FILE_NAME} ${CACHE_FOLDER}
    '''
    sh '''
    npm run ngcc
    '''
    sh '''
    npm run build
    '''
  }
}
Cyclion
  • 738
  • 9
  • 9
1

I have chosen to run every build in a fresh Docker container, but dependency caching can still be done. This is what I have done:

  • Each project has a cache for npm packages: they are archived in a tarball containing the node_modules folder. These archives are all stored in the /home/.cache/node_modules folder on the host (the node where the build runs). So, when starting the Docker container, it must have a bind mount like
docker { 
    image dockerImage
    args "... -v \"/home/.cache/node_modules:/home/.cache/node_modules\""
}
  • I am using a shared library with a custom step for building; its implementation is more or less this one:
sh """#!/bin/bash -xe
    function getNodeModulesListHash {
        npm ls 2> /dev/null | md5sum | cut -d ' ' -f 1
    }
    
    frontendProjectHashZip="\$(echo "${project}" | md5sum | cut -d ' '  -f 1).tar"
    [[ -f "/home/.cache/node_modules/\$frontendProjectHashZip" ]] && tar -xf "/home/.cache/node_modules/\$frontendProjectHashZip"

    hashBeforeInstall="\$(getNodeModulesListHash)"
    npm install
    hashAfterInstall="\$(getNodeModulesListHash)"

    if [[ \$hashBeforeInstall != \$hashAfterInstall ]]
    then 
        tar -cf \$frontendProjectHashZip node_modules
        rm -f "/home/.cache/node_modules/\$frontendProjectHashZip"
        mv \$frontendProjectHashZip "/home/.cache/node_modules/\$frontendProjectHashZip"
    fi
"""

The getNodeModulesListHash function returns a hash of the currently installed packages. This hash is computed before and after npm install: if the values are the same, I do not need to recreate the archive with node_modules and can keep the one that was initially extracted. The rest is pretty straightforward, and the logic is very similar to what other users proposed.

Marco Luzzara
  • 5,540
  • 3
  • 16
  • 42
0

I don't know node.js well enough to know how to handle this on that side. But one simple way this could be handled on a Linux machine is to symlink the cache directory to an external location right after you check out from git. Each agent machine will maintain its own cache, but you would probably have to do that regardless of the solution.

I assume you have investigated the NodeJS plugin, and that it can't do what you want.
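The symlink idea above, sketched (the per-project cache directory is an assumed path you pass in):

```shell
#!/bin/sh
# Sketch: keep node_modules outside the workspace and symlink it in
# right after checkout. $1 is the per-project cache directory.
link_node_modules() {
  mkdir -p "$1"
  rm -rf node_modules          # drop whatever the checkout left behind
  ln -s "$1" node_modules      # npm install now writes into the cache
}
```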

Rob Hales
  • 5,123
  • 1
  • 21
  • 33
  • Yes, but with node the packages are downloaded next to the file package.json, which defines the dependencies. So, I would have to copy the package.json from the project to the cache folder, download dependencies in there and then create the symlink.... I'm not sure... I was looking for something very direct. – AFP_555 Oct 22 '17 at 18:15
  • Ah. I see. I told you I didn't know enough about node.js. :) Not that it solves the problem any better, but I would think you could make the symlink, copy the package.json in, then download the dependencies. Still not very elegant. – Rob Hales Oct 22 '17 at 20:13