
For example, I want to get this folder: https://github.com/python/cpython/tree/2.7/Tools/freeze

The commands I ran were:

mkdir python
cd python
git init
git remote add origin https://github.com/python/cpython.git
git config core.sparsecheckout true
echo "Tools/freeze/*" >> .git/info/sparse-checkout

# find remote branches
git remote show origin

# this works and pulls only that folder
git pull --depth=1 origin master

# but this doesn't, why?
git pull --depth=1 origin 2.7

# but how do I switch to remote 2.7 branch?
git checkout --track -b 2.7 origin/2.7
fatal: Cannot update paths and switch to branch '2.7' at the same time.
Did you intend to checkout 'origin/2.7' which can not be resolved as commit?

I read somewhere that I need to run git fetch before checkout, but that kind of defeats the purpose of sparse checkout: my internet connection is slow and the repo is huge. How can I get just that subdirectory from branch 2.7? Thanks!

This is on Windows 8 with Git Bash.

Edit: if I run git pull --depth=1 origin 2.7, it pulls the remote 2.7 branch but also brings all the other files into my working directory, whereas git pull --depth=1 origin master brings in only the Tools/freeze directory from the master branch. Why is this happening?

Another example:

mkdir qt
cd qt
git init
git remote add origin https://github.com/qtproject/qt.git
git config core.sparsecheckout true
echo util/qlalr/examples/lambda/* >> .git/info/sparse-checkout
git pull --depth=1 origin 4.8

The folder util/qlalr/examples/lambda is very small, but the last command is still slow. Can this be avoided?

Edit 2: I realized that this is not possible with current git. My only remaining question is: why doesn't git pull --depth=1 origin 2.7 respect the sparse checkout config?

Cœur
Shuman

5 Answers

4

You have to create a local branch for reference. The updated steps should be:

git init <repo>
cd <repo>
git remote add origin <url>
git config core.sparsecheckout true
echo "finisht/*" >> .git/info/sparse-checkout
git checkout -b <your branch>
git pull --depth=1 origin <your branch>
1

Your checkout failed because pulling (and hence fetching) an explicit ref fetches only that ref. After your initial pull, your repo had only refs/heads/master and refs/remotes/origin/master, both pointing at the same commit. Checking out 2.7 didn't work because your repo didn't have anything by that name.
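A minimal local sketch of this behavior, assuming a throwaway upstream repository built in a temporary directory (the paths, file names, and commit messages are made up for illustration; only the branch names mirror the question):

```shell
set -e
work=$(mktemp -d)

# Hypothetical upstream with two branches, master and 2.7.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.email demo@example.com
git config user.name demo
echo a > a.txt && git add a.txt && git commit -q -m "master commit"
git checkout -q -b 2.7
echo b > b.txt && git add b.txt && git commit -q -m "2.7 commit"

# Fresh repository that pulls only the explicit ref "master".
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/upstream"
git pull -q origin master

# List every ref the clone knows about; origin/2.7 is not among them,
# which is why "git checkout --track -b 2.7 origin/2.7" cannot resolve it.
refs=$(git for-each-ref --format='%(refname)')
echo "$refs"
```

Running the same listing in the repository from the question should show whether origin/2.7 was ever fetched.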

Pull does a merge, and the extra content that git pull origin 2.7 put in your worktree is there for conflict resolution: merge can't determine the correct results, so you have to. You'll see that not everything outside the Tools directory is checked out, only the conflicted files. I'm not sure how merge should behave overall with a shallow fetch and sparse checkout, but asking for conflict resolution is surely the only thing to do here.

A shallow one-ref fetch is as lightweight as git gets. If one-off bandwidth use is really that dear, you could clone to an EC2 instance and tag a particular tree.
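One possible way to sidestep the merge is to fetch the single branch and check it out directly instead of pulling it into an existing branch. The sketch below is an assumption-laden illustration, not something taken from the thread: it uses a hypothetical local upstream (with made-up files Tools/freeze/freeze.py and Lib/os.py) in place of the real repository, and reaches it through a file:// URL so that --depth is honored.

```shell
set -e
work=$(mktemp -d)

# Throwaway upstream standing in for the real repository (hypothetical content).
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/2.7
git config user.email demo@example.com
git config user.name demo
mkdir -p Tools/freeze Lib
echo freeze > Tools/freeze/freeze.py
echo other > Lib/os.py
git add . && git commit -q -m "2.7 commit"

# Client: same sparse-checkout setup as in the question.
git init -q "$work/client"
cd "$work/client"
git remote add origin "file://$work/upstream"
git config core.sparseCheckout true
mkdir -p .git/info
echo "Tools/freeze/*" >> .git/info/sparse-checkout

# Shallow fetch of one ref, then a plain checkout: no merge is involved.
git fetch -q --depth=1 origin 2.7
git checkout -q -b 2.7 FETCH_HEAD

# Show which files actually landed in the working tree.
find . -path ./.git -prune -o -type f -print
```

The fetch still transfers the objects for the whole tip commit; the sparse-checkout file only limits what is written to the working tree.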

jthill
0

Try this:

mkdir <repo>
cd <repo>
git init
git remote add -f origin <url>

This creates an empty repository and fetches all objects but doesn't check them out. Then do:

git config core.sparseCheckout true

Now define which folders you want by adding them to .git/info/sparse-checkout:

echo "some/dir/" >> .git/info/sparse-checkout
echo "another/sub/tree" >> .git/info/sparse-checkout

Then:

git pull origin master
Guillaume Jacquenot
Tejus Prasad
  • Thanks, but the whole idea is to not use -f ("fetch"), because fetching is very slow on my low-speed internet connection. It still gets all the data I don't need, right? – Shuman Mar 05 '16 at 22:32
0

First of all, set the config parameter:

# Enable sparse-checkout:
git config core.sparsecheckout true

Configure sparse-checkout paths in .git/info/sparse-checkout:

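# Add the relevant path to the sparse-checkout file.
# Patterns are relative to the repository root, so for the folder in the
# question this would be Tools/freeze/ rather than the GitHub URL path below.
echo cpython/tree/2.7/Tools/freeze >> .git/info/sparse-checkout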

Update your working tree:

git read-tree -mu HEAD

git-read-tree: Reads tree information into the index.

-m: Perform a merge, not just a read.

-u: After a successful merge, update the files in the work tree with the result of the merge.


sparse checkout

With sparse checkout you basically tell Git to exclude a certain set of files from the working tree. Those files will still be part of the repository but they won't show up in your working directory.

Internally, sparse checkout uses the skip-worktree flag to mark all the excluded files as always updated.

# enable sparse checkout in an existing repository:
git config core.sparseCheckout true

# Create a .git/info/sparse-checkout file containing the
# paths to include/exclude from your working directory. 

# Update your working directory with 
git read-tree -mu HEAD
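The skip-worktree flag mentioned above can be inspected with git ls-files -v, which prefixes flagged entries with S. A minimal sketch, assuming a throwaway repository with two made-up directories, keep/ and drop/:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo

# Hypothetical content: one directory to keep, one to exclude.
mkdir -p keep drop
echo 1 > keep/a.txt
echo 2 > drop/b.txt
git add . && git commit -q -m "initial commit"

# Enable sparse checkout and include only keep/.
git config core.sparseCheckout true
mkdir -p .git/info
echo "keep/" > .git/info/sparse-checkout
git read-tree -mu HEAD

# Entries marked "S" carry the skip-worktree flag; "H" entries are checked out.
flags=$(git ls-files -v)
echo "$flags"
```

The excluded file remains in the index and the object database; only the working-tree copy is removed.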


CodeWizard
  • Are you saying I should run this? I just tried it and it's not working. Can you give the full command? http://pastebin.com/W4yiJhwe I see what you mean; you must have copied from here: http://blogs.atlassian.com/2014/05/handle-big-repositories-git/ But the whole idea is to not clone first; I mentioned that I want to avoid cloning or fetching the whole repo. You can try your suggestion on a big repo such as qt: it still gets everything. Pick any small folder; it should finish in a few seconds, right? Not 10-20 min. – Shuman Mar 05 '16 at 23:04
  • I don't have a clone of the repo in the first place. – Shuman Mar 05 '16 at 23:10
  • Hi, I did not copy the solution from this link, but I did run it on a fully cloned (fetched) repo. And yes, it shouldn't take 10-20 minutes to run. – CodeWizard Mar 05 '16 at 23:12
  • Thanks! I want to avoid the full clone (fetch); there must be a way, right? The commands I posted in my question already do that. The only problem is that they get the `master` branch only. I don't know why, when I replace `git pull --depth=1 origin master` with `git pull --depth=1 origin 2.7`, it pulls all the files in other folders. It seems as if the sparse checkout configuration is not working when I pull a branch other than `master`. – Shuman Mar 05 '16 at 23:15
  • @Shuman Take a look at the content you are putting into `.git/info/sparse-checkout`. It starts with `cpython/tree/2.7/......`, which means you are already pulling from branch `2.7`. Now combine that with `--depth=1`. It should work. – Khurshid Alam May 27 '16 at 04:03
0

I want to download a particular folder from a branch without downloading the whole history of the master repository, as that repo has a huge amount of history and a large number of branches.

mkdir testFolder
cd testFolder
git init
git remote add origin <URL>

The git command below will fetch branches starting with f_2, for example f_23, f_24, etc. (the refspec is quoted so the shell does not try to expand the *):

git config remote.origin.fetch '+refs/heads/f_2*:refs/remotes/origin/f_2*'
git fetch --depth=1

Set the name of the folder that you want to check out.

git sparse-checkout set <folderName>

The command below will check out that particular folder from the f_23 branch:

git checkout f_23
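To see what the restricted refspec does, here is a minimal local sketch. It assumes a hypothetical upstream with branches f_23, f_24, and one named unrelated, reached through a file:// URL so that --depth is honored; none of these names come from a real repository.

```shell
set -e
work=$(mktemp -d)

# Hypothetical upstream with two matching branches and one that does not match.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/f_23
git config user.email demo@example.com
git config user.name demo
echo x > x.txt && git add x.txt && git commit -q -m "first commit"
git branch f_24
git branch unrelated

# Client that only asks for branches whose names start with f_2.
git init -q "$work/client"
cd "$work/client"
git remote add origin "file://$work/upstream"
git config remote.origin.fetch '+refs/heads/f_2*:refs/remotes/origin/f_2*'
git fetch -q --depth=1

# Only the matching branches get remote-tracking refs.
remote_refs=$(git for-each-ref --format='%(refname)' refs/remotes)
echo "$remote_refs"
```

Narrowing the refspec limits which branch tips are fetched; combined with --depth=1 it keeps history short, but each fetched tip still brings its full tree.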
  • @Jean-François Fabre I removed my answer from another [question](https://stackoverflow.com/questions/4114887/is-it-possible-to-do-a-sparse-checkout-without-checking-out-the-whole-repository) and posted it here, as my solution seems more related to this question. – harshini gulipalli Apr 10 '22 at 04:55