33

My repository in my organisation's Azure DevOps project contains many .NET solutions as well as some Unity projects. When I run my build pipeline, it fails with several errors like this one:

Error MSB3491: Could not write lines to file "obj\Release\path\to\file". There is not enough space on the disk.

I would like the pipeline to check out and fetch only the parts of the repository that are required for a successful build. This might also reduce the pipeline's execution time, since it currently fetches all of my Unity projects as well, with gigabytes of resources, which takes forever.

I would like to spread my projects across multiple repositories, but the admin won't give me more than the one I already have. Things got a lot better when I configured git fetch as shallow (--depth=1), but I still get the error every now and then.

This is how I configured the checkout:

steps:
- checkout: self
  clean: true
  # shallow fetch
  fetchDepth: 1
  lfs: false
  submodules: false

The build is done using the VSBuild@1 task.

I can't find a valid solution to my problem except for using multiple repositories, which is not an option right now.

Edit: Shayki Abramczyk's solution #1 works perfectly. Here is my full implementation.

GitSparseCheckout.yml:

parameters:
  access: ''
  repository: ''
  sourcePath: ''

steps:
- checkout: none

- task: CmdLine@2
  inputs:
    script: |
      ECHO ##[command] git init
      git init
      ECHO ##[command] git sparse-checkout: ${{ parameters.sourcePath }}
      git config core.sparsecheckout true
      echo ${{ parameters.sourcePath }} >> .git/info/sparse-checkout
      ECHO ##[command] git remote add origin https://${{ parameters.repository }}
      git remote add origin https://${{ parameters.access }}@${{ parameters.repository }}
      ECHO ##[command] git fetch --progress --verbose --depth=1 origin master
      git fetch --progress --verbose --depth=1 origin master
      ECHO ##[command] git pull --progress --verbose origin master
      git pull --progress --verbose origin master

The checkout is called like this (the template path has to be adjusted):

- template: ../steps/GitSparseCheckout.yml
  parameters:
    access: anything:<YOUR_PERSONAL_ACCESS_TOKEN>
    repository: dev.azure.com/organisation/project/_git/repository
    sourcePath: path/to/files/
MikeLimaSierra

5 Answers

28

In Azure DevOps there is no option to get only part of the repository, but there is a workaround: disable the "Get sources" step and get only the sources you want by manually running the corresponding git commands in a script.

To disable the default "Get Sources" just specify none in the checkout statement:

- checkout: none

In the pipeline, add a CMD/PowerShell task to get the sources manually, using one of the following two options:

1. Get only part of the repo with git sparse-checkout. For example, get only the directories src_1 and src_2 within the test folder (lines starting with REM ### are just the usual batch comments):

- script: |
    REM ### this will create a 'root' directory for your repo and cd into it
    mkdir myRepo
    cd myRepo
    REM ### initialize Git in the current directory
    git init
    REM ### set Git sparsecheckout to TRUE
    git config core.sparsecheckout true
    REM ### write the directories that you want to pull to the .git/info/sparse-checkout file (without the root directory)
    REM ### you can add multiple directories with multiple lines
    echo test/src_1/ >> .git/info/sparse-checkout
    echo test/src_2/ >> .git/info/sparse-checkout
    REM ### fetch the remote repo using your access token
    git remote add -f origin https://your.access.token@path.to.your/repo
    REM ### pull the files from the source branch of this build, using the built-in Azure DevOps variable for the branch name
    git pull origin $(Build.SourceBranch)
  displayName: 'Get only test/src_1 & test/src_2 directories instead of entire repository'

Now, in the build task, make myRepo the working directory. Fetching the remote repo with an access token is necessary, since using checkout: none prevents your login credentials from being used. At the end of the pipeline you may want to add a step that cleans the myRepo directory.
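
Since the token has to be placed into the remote URL by the script itself, it helps to keep it out of the build log. The following is a minimal shell sketch, assuming an HTTPS repository URL and a token passed in as an argument; the helper names auth_url and mask_token are hypothetical and not part of git or Azure DevOps. As in the question's edit, the user name in front of the token can be any value when a personal access token is used.

```shell
# Hypothetical helpers; names and URL layout are assumptions for illustration.

# Build an authenticated HTTPS URL: https://<user>:<token>@<host/path>.
# A user name already embedded in the URL (e.g. "org@") is dropped first.
auth_url() {
  local url="$1" user="$2" token="$3"
  local rest="${url#https://}"   # strip the scheme
  rest="${rest#*@}"              # strip an existing "user@" prefix, if present
  printf 'https://%s:%s@%s\n' "$user" "$token" "$rest"
}

# Replace the first occurrence of the token with **** so the URL can be
# echoed safely. The URL is printed unchanged if the token is empty or absent.
mask_token() {
  local url="$1" token="$2"
  if [ -z "$token" ]; then
    printf '%s\n' "$url"
    return
  fi
  case "$url" in
    *"$token"*) printf '%s****%s\n' "${url%%"$token"*}" "${url#*"$token"}" ;;
    *)          printf '%s\n' "$url" ;;
  esac
}
```

The script would then pass the output of auth_url to git remote add and echo only the output of mask_token.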

2. Get parts of the repo with the Azure DevOps REST API (Git - Items - Get Items Batch).
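
No code is given for option 2, so here is a rough shell sketch of the idea. Note that it swaps in the related "Items - Get" endpoint rather than the batch endpoint named above: to my knowledge that endpoint can return a folder as a zip archive when $format=zip and download=true are set. The helper name items_zip_url, the API version and the argument values are assumptions, so check them against the REST API documentation before relying on this.

```shell
# Hypothetical helper: print an "Items - Get" URL that asks for one folder of
# one branch as a zip archive. Assumes none of the values need URL encoding.
items_zip_url() {
  local org="$1" project="$2" repo="$3" path="$4" branch="$5"
  printf 'https://dev.azure.com/%s/%s/_apis/git/repositories/%s/items' \
    "$org" "$project" "$repo"
  printf '?path=%s&versionDescriptor.version=%s&versionDescriptor.versionType=branch' \
    "$path" "$branch"
  printf '&$format=zip&download=true&api-version=7.0\n'
}

# Possible usage in a pipeline step (assumes System.AccessToken has been
# mapped to the SYSTEM_ACCESSTOKEN environment variable of the step):
#   curl -sSfL -H "Authorization: Bearer $SYSTEM_ACCESSTOKEN" -o files.zip \
#     "$(items_zip_url organisation project repository /path/to/files master)"
```

Compared with sparse checkout, this gives you plain files without a .git directory, which may or may not suit the build.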

bradib0y
Shayki Abramczyk
6

The other answers work well, but I found a different way using potentially newer features of git.

This will fetch to a depth of 1 and check out all the files in the root folder plus folder1, folder2 and folder3:

        - task: CmdLine@2
          inputs:
            script: |
              git init
              git sparse-checkout init --cone
              git sparse-checkout set folder1 folder2 folder3
              git remote add origin https://<github-username>:%GITHUB_TOKEN%@<your-git-repo>
              git fetch --progress --verbose --depth=1 origin
              git switch develop
          env:
            GITHUB_TOKEN: $(GITHUB_TOKEN)
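
One commenter below reports that this sequence still downloaded everything for them and that a partial clone (git clone --filter=blob:none --depth 1 --sparse) fixed it. A minimal shell sketch of that variant follows, with the caveat that the run and sparse_clone helpers, the DRY_RUN switch and the use of --branch and sparse-checkout set (instead of add) are my own assumptions, and that it needs a reasonably recent git.

```shell
# Print each command; execute it as well unless DRY_RUN=1 is set.
run() {
  printf '+ %s\n' "$*"
  [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Hypothetical wrapper: shallow, blob-less, sparse clone of one branch,
# then restrict the working tree to the given folders.
#   sparse_clone <url> <branch> <target-dir> <folder>...
sparse_clone() {
  local url="$1" branch="$2" dir="$3"
  shift 3
  run git clone --filter=blob:none --depth 1 --sparse --branch "$branch" "$url" "$dir" &&
    run git -C "$dir" sparse-checkout set "$@"
}
```

Setting DRY_RUN=1 before calling sparse_clone only prints the two git commands, which is handy for checking the arguments in a pipeline log before running it for real.
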
Peter Grainger
  • This works great! I added "git clean -ffdx" after git init. Our repository is huge, so this helped save a lot of time. Thanks! – Tim Wilson Sep 15 '20 at 14:20
  • We ended up not needing to perform a git clean. We just cleaned the workspace at the start of the job (since we recently switched to self-hosted agents - not needed for Microsoft-hosted). – Tim Wilson Sep 15 '20 at 18:49
  • This doesn't seem to work. It still downloads the whole thing. – user1324887 Sep 22 '20 at 22:50
  • @user1324887 maybe it's your version of git, this assumes the latest version – Peter Grainger Sep 23 '20 at 14:40
  • 1
    This is on Azure Dev ops with vsts on latest version. I changed it to `git clone --filter=blob:none --depth 1 --sparse REPO_URL` followed by sparse settings and `git sparse-checkout add FOLDER_OR_FILE` and that worked – user1324887 Sep 23 '20 at 17:06
4

It may help to check out only a specific branch. This works as follows:

resources:
  repositories:
  - repository: MyGitHubRepo
    type: github
    endpoint: MyGitHubServiceConnection
    name: MyGitHubOrgOrUser/MyGitHubRepo
    ref: features/tools

steps:
- checkout: MyGitHubRepo

Or by using the inline syntax, like so:

- checkout: git://MyProject/MyRepo@features/tools # checks out the features/tools branch
- checkout: git://MyProject/MyRepo@refs/heads/features/tools # also checks out the features/tools branch
- checkout: git://MyProject/MyRepo@refs/tags/MyTag # checks out the commit referenced by MyTag.

More information can be found here

Vertexwahn
Matthias Güntert
  • 11
    I don't think this achieves what is asked; it only checks out a certain branch or tag. What is asked here is how to get a certain path, even in the master branch (to check out only one project in a monorepo) – Kat Lim Ruiz Apr 18 '21 at 00:21
  • 1
    Indeed, I must have misunderstood. I will still leave it. – Matthias Güntert Apr 19 '21 at 07:21
  • @MatthiasGüntert No! Delete it! I upvoted it before I realized that it does not solve the issue, which was a mistake. I cannot downvote it now, since my vote is locked. – Vertexwahn Feb 18 '22 at 15:49
4

With LFS support on Ubuntu and Windows agents

parameters:
  folders: '*'

steps:
- bash: |
      set -ex
      export ORIGIN=$(Build.Repository.Uri)
      export REF=$(Build.SourceVersion)
      export FOLDERS='${{ parameters.folders }}'
      git version
      git lfs version
      git init
      git sparse-checkout init --cone
      git sparse-checkout add $FOLDERS
      git remote add origin $ORIGIN
      git config core.sparsecheckout true
      git config gc.auto 0
      git config advice.detachedHead false
      git config http.version HTTP/1.1
      git lfs install --local
      git config uploadpack.allowReachableSHA1InWant true
      git config http.extraheader "AUTHORIZATION: bearer $(System.AccessToken)"
      git fetch --force --no-tags --progress --depth 1 origin $REF
      git checkout $REF --progress --force
  displayName: Fast sparse Checkout

Then use it as a step:

  steps:
  - checkout: none

  - template: fastCheckout.yaml
    parameters:
      folders: 'Folder1 src/Folder2'

You can pass the folders as parameters.

The exports are there to make it easier to test the script locally.

This improved checkout time from 10 minutes to 2 minutes.

Michael Blake
3

A Solution For Pull Request and Master Support

After posting this solution I realized that it is similar to the updated one in the question. However, this solution is somewhat richer and more optimized. Most importantly, it uses the pull request merge branch in Azure DevOps for deployments, as the native checkout does. It also fetches only the commits that are needed.

  • Supports multiple folder/path patterns as parameters
  • Minimal checkout with the bare minimum needed via sparse checkout
  • Shallow depth, multithreaded fetch, with a sparse index
  • Uses the PR merge branch against main, rather than the raw PR branch itself, where needed
  • Uses native System Token already in pipeline
  • Handles detection and alternative ref flows for master where a merge branch does not exist.
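
The ref selection described in the last few points can be summarised in a small shell sketch. It is a simplified illustration only: the function name select_ref, its arguments and the yes/no flag are hypothetical, and the fallback assumes you know the short name of the PR source branch.

```shell
# Hypothetical helper: decide which remote ref to fetch.
#   select_ref <build-reason> <pr-id> <source-branch> <has-merge-ref: yes|no>
select_ref() {
  local reason="$1" pr_id="$2" source_branch="$3" has_merge_ref="$4"
  if [ "$reason" = "PullRequest" ]; then
    if [ "$has_merge_ref" = "yes" ]; then
      # Preferred: the merge ref that combines the PR with its target branch.
      printf 'refs/pull/%s/merge\n' "$pr_id"
    else
      # Fallback: the raw PR source branch.
      printf 'refs/heads/%s\n' "$source_branch"
    fi
  else
    # Not a pull request: assume master, as the answer does.
    printf 'refs/heads/master\n'
  fi
}

# The has-merge-ref flag could be derived with something like
#   git ls-remote --exit-code "$url" "refs/pull/$pr_id/merge"
# which exits non-zero when the ref does not exist on the remote.
```

For example, select_ref PullRequest 42 features/tools yes prints refs/pull/42/merge, while any non-PR build reason yields refs/heads/master.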

Example use in your pipeline:

  - job: JobNameHere
    displayName: JobDisplayName Here
    steps:
      - template: templates/sparse-checkout.yml
        parameters:
          checkoutFolders:
            - /Scripts
            - /example-file.ps1
      # other steps

templates/sparse-checkout.yml:

parameters:
- name: checkoutFolders  
  default: '*'
  type: object

steps:
  - checkout: none
  - task: PowerShell@2
    inputs:
        targetType: inline
        script: |
            $useMasterMergeIfAvaiable = $true
            
            $checkoutFolders = ($env:CheckoutFolders | ConvertFrom-Json)
            Write-Host $checkoutFolders
            
            $sw = [Diagnostics.Stopwatch]::StartNew() # For timing the run.

            $checkoutLocation = $env:Repository_Path
            
            ################ Setup Variables ###############
            $accessToken = "$env:System_AccessToken";
            $repoUriSegments = $env:Build_Repository_Uri.Split("@");
            $repository = "$($repoUriSegments[0]):$accessToken@$($repoUriSegments[1])"
            $checkoutBranchName = $env:Build_SourceBranch;
            $prId = $env:System_PullRequest_PullRequestId;
            $repositoryPathForDisplay = $repository.Replace("$accessToken", "****");
            $isPullRequest = $env:Build_Reason -eq "PullRequest";

            ################ Configure Refs ##############
            if ($isPullRequest)
            {
                Write-Host "Detected Pull Request"
                $pullRequestRefMap = "refs/heads/$($checkoutBranchName):refs/remotes/origin/pull/$prId"
                $mergeRefMap = "refs/pull/$prId/merge:refs/remotes/origin/pull/$prId";
                $mergeRefRemote = $mergeRefMap.Split(":")[0];

                $remoteMergeBranch = git ls-remote $repository "$mergeRefRemote"  # See if the remote merge ref exists for the PR.
                if ($useMasterMergeIfAvaiable -and $remoteMergeBranch)
                {
                    Write-Host "Remote Merge Branch Found: $remoteMergeBranch" -ForegroundColor Green
                    $refMapForCheckout = $mergeRefMap
                    $remoteRefForCheckout = "pull/$prId/merge"
                }else{
                    Write-Host "No merge from master found (or merge flag is off in script), using pullrequest branch." -ForegroundColor Yellow
                    $refMapForCheckout = $pullRequestRefMap
                    $remoteRefForCheckout = "heads/$checkoutBranchName"
                }
                $localRef = "origin/pull/$prId"
            }else{
                Write-Host "This is not a pull request. Assuming master branch as source."
                $localRef = "origin/master"
                $remoteRefForCheckout = "master"
            }

            ######## Sparse Checkout ###########
            Write-Host "Beginning Sparse Checkout..." -ForegroundColor Green;
            Write-Host " | Repository: $repositoryPathForDisplay" -ForegroundColor Cyan
            if (-not (test-path $checkoutLocation) ) {
                $out = mkdir -Force $checkoutLocation
            }
            $out = Set-Location $checkoutLocation
            git init -q
            git config core.sparsecheckout true
            git config advice.detachedHead false
            git config index.sparse true
            git remote add origin $repository
            git config remote.origin.fetch $refMapForCheckout
            git sparse-checkout set --sparse-index $checkoutFolders
            Write-Host " | Remote origin configured. Fetching..."
            git fetch -j 4 --depth 1 --no-tags -q origin $remoteRefForCheckout
            Write-Host " | Checking out..."
            git checkout $localRef -q

            Get-ChildItem -Name 
            # tree . # Shows a graphical structure - can be large with lots of files.
            ############ Clean up ##################
            if (Test-Path -Path ..\$checkoutLocation)
            {
                Write-Host "`nChecked Out`n#############"
                Set-Location ../
            }
            $sw.Stop()
            Write-Host "`nCheckout Complete in $($sw.Elapsed.TotalSeconds) seconds." -ForegroundColor Green
    displayName: 'Sparse Checkout'
    env:
        Build_Repository_Uri: $(Build.Repository.Uri)
        Build_Reason: $(Build.Reason)
        System_PullRequest_SourceBranch: $(System.PullRequest.SourceBranch)
        System_PullRequest_PullRequestId: $(System.PullRequest.PullRequestId)
        System_PullRequest_SourceRepositoryURI: $(System.PullRequest.SourceRepositoryURI)
        Build_BuildId: $(Build.BuildId)
        Build_SourceBranch: $(Build.SourceBranch)
        CheckoutFolders: ${{ convertToJson(parameters.checkoutFolders) }}
        System_AccessToken: $(System.AccessToken)
        Repository_Path: $(Build.Repository.LocalPath)
Joshua Enfield