73

With the new sparse checkout feature in Git 1.7.0, is it possible to just get the contents of a subdirectory like how you can in SVN? I found this example, but it preserves the full directory structure. Imagine that I just wanted the contents of the 'perl' directory, without an actual directory named 'perl'.

-- EDIT --

Example:

My git repository contains the following paths

repo/.git/
repo/perl/
repo/perl/script1.pl
repo/perl/script2.pl
repo/images/
repo/images/image1.jpg
repo/images/image2.jpg
repo/doc/
repo/doc/readme.txt
repo/doc/help.txt

What I want is to be able to produce from the above repository this layout:

repo/.git/
repo/script1.pl
repo/script2.pl

However with the current sparse checkout feature, it seems like it is only possible to get

repo/.git/
repo/perl/script1.pl
repo/perl/script2.pl

which is NOT what I want.

Gabriel Devillers
davr
  • 4
    they finally implemented it! cool! – Mauricio Scheffer Feb 25 '10 at 18:42
  • Why? What is the problem? And why you want to have different directory structure in the repository and different locally? Does not make much sense at first glance. – Jiri Klouda Mar 09 '10 at 09:02
  • 2
    @Jiri: I have a web application with actionscript (client-side) and PHP (server-side) code. The files are closely related, so I want to put them in a single repo/branch. However I do not want the actionscript source files on the server, only the PHP files. – davr Mar 09 '10 at 21:02
  • 2
    @davr this isn't that rare a circumstance, I wanted the exact thing. Pity I can't get it yet. – preinheimer Oct 07 '11 at 05:00
  • @preinheimer, it's also something I'm trying to get. It would make developing and testing a theme I'm making a whole lot easier. – apokaliptis Nov 04 '14 at 23:04

7 Answers

27

You still need to clone the whole repository, which will have all the files. You could use the --depth flag to only retrieve a limited amount of history.

Once the repository is cloned, the read-tree trick limits your "view" of the repository to only those files or directories that are in the .git/info/sparse-checkout file.

I wrote a quick script to help manage the sparseness, since at the moment it is a bit unfriendly:

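#!/bin/sh
# Run from the top level of the working tree.
# Replace the sparse-checkout patterns with the arguments, then refresh the working tree.
: > .git/info/sparse-checkout
for i in "$@"
do
    echo "$i" >> .git/info/sparse-checkout
done
git read-tree -m -u HEAD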

If you save this script as git-sparse (no extension, marked executable) in the directory reported by git --exec-path, or anywhere else on your PATH, then you can run git sparse foo/ bar/ to "check out" only the foo and bar directories, or git sparse '*' to get everything back again.
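As a rough end-to-end illustration of the mechanism described above, the following sketch builds a throwaway repository and limits its working tree to perl/. The scratch directory, file contents, and commit identity are all made up for the demo, and sparse checkout is switched on explicitly with core.sparseCheckout. Note that perl/ remains a directory: the files are filtered, not moved.

```shell
#!/bin/sh
# Sketch only: throwaway repository in a scratch directory (all names are hypothetical).
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo
mkdir perl images doc
echo 'print "1\n";' > perl/script1.pl
echo 'readme' > doc/readme.txt
echo 'img' > images/image1.jpg
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial layout'

# Enable sparse checkout, list the wanted paths, and refresh the working tree.
git config core.sparseCheckout true
mkdir -p .git/info
echo 'perl/' > .git/info/sparse-checkout
git read-tree -m -u HEAD

# perl/script1.pl is still under perl/; doc/readme.txt and images/image1.jpg
# are no longer present in the working tree.
ls
```

Writing '*' into .git/info/sparse-checkout and rerunning git read-tree -m -u HEAD should bring the other files back.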

kkahl
richq
  • Thanks for the help, but that doesn't seem to answer my question. See my updated question for clarification. – davr Feb 26 '10 at 22:59
  • 3
    Yeah, sparse is just a way to filter the actual tree, it can't move files around. So you can't do what you want... – richq Feb 27 '10 at 11:07
15

The short answer is no. Git sees all files as a single unit.

What I recommend is that you break your repository down into logical chunks: a separate one each for perl, images, and docs. If you also need to maintain the uber-repo style, you can create a repo made up of submodules.
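A minimal sketch of that layout, assuming three tiny local repositories and an umbrella repository named uber (all names, paths, and the commit identity are invented for the demo):

```shell
#!/bin/sh
# Sketch only: one repository per chunk, plus an umbrella repo that uses submodules.
set -e
demo=$(mktemp -d)
cd "$demo"

# Helper that supplies a throwaway commit identity (hypothetical values).
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# One small repository per logical chunk.
for part in perl images doc; do
    git init -q "$part"
    echo "placeholder for $part" > "$part/README"
    g -C "$part" add README
    g -C "$part" commit -q -m "start $part"
done

# The umbrella repository references each chunk as a submodule.
git init -q uber
cd uber
for part in perl images doc; do
    # protocol.file.allow is set only because this demo clones from local paths.
    g -c protocol.file.allow=always submodule --quiet add "$demo/$part" "$part"
done
g commit -q -m 'add submodules'
git submodule status
```

Cloning only the perl repository then gives its files at the top level of that clone, while the umbrella repository keeps the combined view.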

John K
6

Without going into why you would want to do this, your problem can probably be solved easily with a symlink/shortcut.
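For instance, on a Unix-like system a symlink can expose the contents of repo/perl under another name. This is only a sketch: the scratch directory and the link name site are hypothetical, and a plain directory stands in for a real clone.

```shell
#!/bin/sh
# Sketch only: a plain directory stands in for the cloned repository.
set -e
demo=$(mktemp -d)
mkdir -p "$demo/repo/perl"
echo 'print "1\n";' > "$demo/repo/perl/script1.pl"

# "site" now shows the contents of repo/perl without a perl/ level in between.
ln -s "$demo/repo/perl" "$demo/site"
ls "$demo/site"
```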

To answer the question: no, and for a good reason. The whole history of the repo is downloaded even with a sparse checkout. This is necessary because otherwise tracking renamed files would be a pain in the ...neck. Imagine you move the file /repo_root/asd/file1.cpp to /repo_root/fgh/file1.cpp: if you had downloaded only the deltas for /repo_root/fgh, you would not know about file1.cpp. So you must download all deltas. But then you have a full repository, not just one folder cut out of it, and the /repo_root/fgh folder alone is not a repo in itself. This might not seem important when you check out, but when you commit, git might not know enough to work correctly.

Workaround: If you really want to, you can create a script that calls git checkout and then copies the subdirectory out (for the sh shell; a batch version for Windows should not be hard to produce):

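#!/bin/sh
# Arguments: <branch> <repo-dir> <subdir> <dest-dir>
# dest-dir should be an absolute path, because the copy runs from inside repo-dir.
curDir=$(pwd)
cd "$2" || exit 1
git checkout "$1" || exit 1
cp -R "$3"/* "$4"
cd "$curDir"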

Here the first argument is the branch to checkout, the second - the folder where the repo is currently present, the third - the subdir you want to really use, and the fourth - the location to which you want it copied.

Warning: my shell skills are almost non-existent, so test this before relying on it. It should not be hard to write the reverse of this script, which copies the files back so that they can be committed to the repo.
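A possible variation on the checkout-and-copy idea, swapping in a different technique: git archive can export the tree HEAD:perl directly, so the files land in the destination without a perl/ level and without switching branches in the repo. This is a sketch under assumptions; the scratch directory, the deploy destination, and the commit identity are hypothetical.

```shell
#!/bin/sh
# Sketch only: throwaway repository and destination in a scratch directory.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q repo
mkdir repo/perl repo/doc
echo 'print "1\n";' > repo/perl/script1.pl
echo 'print "2\n";' > repo/perl/script2.pl
echo 'readme' > repo/doc/readme.txt
git -C repo add .
git -C repo -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial layout'

mkdir deploy
# HEAD:perl names the tree object for perl/, so archived paths are relative to it.
git -C repo archive --format=tar HEAD:perl | tar -x -f - -C deploy
ls deploy
```

The export is one-way: deploy is not a git working tree, so changes made there still have to be copied back by hand before they can be committed.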

Ger4ish
  • Having the whole history of the repo isn't a problem, it's not a big repo, and we have plenty of diskspace. I guess our particular use case isn't that common, so git developers never thought to add it. It's one of the very few things that SVN worked better for us (git does 99 other things better, which is why we switched, but still) – davr Aug 07 '11 at 06:01
  • symlinks on windows are a nightmare - this is not tractable. – Alex Brown Mar 20 '13 at 17:50
6

richq's answer was close, but it missed a step. You need to explicitly enable sparse checkout:

git config core.sparsecheckout true

This blog post has all the steps outlined clearly:

http://blog.quilitz.de/2010/03/checkout-sub-directories-in-git-sparse-checkouts/comment-page-1/#comment-3146

PhilYoussef
3

You can try braid - it tracks remotes while matching them to a path. https://github.com/evilchelu/braid/wiki

Antoine Toulme
3

git filter-branch --subdirectory-filter is what you need, see Detach (move) subdirectory into separate Git repository.

Here is a little bash script to do that.

It first makes a working clone of the original repo, then runs filter-branch with the subdirectory filter to get what you want.

#!/bin/bash
#
# git-subdir.sh
#
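# $1 = source repository, $2 = subdirectory to extract (also used as the clone's directory name)
git clone --no-hardlinks "$1" "$2"

cd "$2" || exit 1

git filter-branch --subdirectory-filter "$2" --prune-empty --tag-name-filter cat HEAD -- --all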

git reset --hard

git remote rm origin

refbak=$(git for-each-ref --format="%(refname)" refs/original/)

if [ -n "$refbak" ];then
    echo -n $refbak | xargs -n 1 git update-ref -d
fi

git reflog expire --expire=now --all

git repack -ad

git gc --aggressive --prune=now

For the example in the question, git-subdir.sh repo perl would work.

Community
weynhamz
0

It appears that what you are trying to do is rename the directory tree so that your files end up in a different place. To me, what you are asking for is an anti-pattern for code/project management on two counts: it undermines the categorization of modules (java bits under a java node, perl under a perl node), and it leaves the project with files in different locations from where the developer visualizes them. Since git maintains hashes of directory contents to see what has changed, this also breaks git as such.

Daemeon Reiydelle