4

I started using Git yesterday; before that I used SVN for many years. Let me explain what exactly I am trying to achieve:

I have a main git repo for different projects. Something like this:

  • main_repo/proj1
  • main_repo/proj2
  • main_repo/proj3

I can't create separate repos for these sub-projects. Now I (or other users) should be able to checkout/commit/push/pull in each of these projects independently.

For this I am trying sparse checkout with the following commands:

  1. mkdir proj1; cd proj1
  2. git init
  3. git remote add origin https://main_repo
  4. git config core.sparsecheckout true
  5. echo "proj1/" > .git/info/sparse-checkout
  6. git pull origin master

Now what I need is to get all the files present in proj1 in the same current dir. What I am getting is something like this:

/proj1/proj1/files_and_dirs_in_proj1

What I need:

/proj1/files_and_dirs_in_proj1

The second thing is that the checkout dir doesn't behave like a git repo. It doesn't carry any .git dir, so I don't understand how to do commits/push/pull in sparse-checked-out projects.

I hope I explained it well. Please suggest.

alexander.polomodov
user2793078
  • One repository for multiple independent projects sounds like a design mistake. Care to elaborate why you can't split them up? – wonce Mar 10 '16 at 23:36
  • @wonce he didn't say they were independent. Given that they are probably not independent, there are many advantages: https://medium.com/@maoberlehner/monorepos-in-the-wild-33c6eb246cb9 – cobberboy Aug 06 '18 at 09:03
  • @user2793078 : I believe a symlink is all you need. In statement 1, use a different name e.g. `allprojects` . Then add #7 : `cd .. && ln -s allprojects/proj1 proj1` – cobberboy Aug 06 '18 at 09:08

3 Answers

2

If I have understood you correctly, you want to:

  • create a git repository with 3 projects inside it (subfolders proj1, proj2, proj3)
  • create a separate directory for each team working on its project (e.g. teamproj1, teamproj2, teamproj3)
  • have each of these directories contain only the files belonging to its project (e.g. teamproj1 contains only the proj1 subfolder, etc.)

In this case you can use these simple steps:

1) Add your main repo to git

git init
git remote add origin ...
git add .
git commit -m 'Initial commit'
git push -u origin master

2) Clone & adjust a git repo for each team (teamproj1, teamproj2, teamproj3, ...). Repeat the code below for each team/project

git clone ... teamproj1
cd teamproj1
git config core.sparsecheckout true
echo 'proj1' > .git/info/sparse-checkout
git read-tree -m -u HEAD

3) Bingo. Each team now has its own folder and project. This folder is a normal git repo, but its working tree shows only the files matching the patterns in .git/info/sparse-checkout
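
The steps above can be tried end to end without a server. The following is only a sketch under assumptions: the scratch directory, the `README` files and the `demo` identity are made up for illustration, and a local folder stands in for the real remote.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch area for the demo; delete it when done

# 1) Build a stand-in "main repo" with three project folders.
git init -q "$work/main_repo"
for p in proj1 proj2 proj3; do
    mkdir "$work/main_repo/$p"
    echo "hello from $p" > "$work/main_repo/$p/README"
done
git -C "$work/main_repo" add .
git -C "$work/main_repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m 'Initial commit'

# 2) Clone it for one team and restrict the working tree to proj1.
git clone -q "$work/main_repo" "$work/teamproj1"
git -C "$work/teamproj1" config core.sparsecheckout true
mkdir -p "$work/teamproj1/.git/info"
echo 'proj1' > "$work/teamproj1/.git/info/sparse-checkout"
git -C "$work/teamproj1" read-tree -m -u HEAD

# 3) List what is left in the working tree of the team clone.
visible=$(ls "$work/teamproj1")
echo "working tree: $visible"
```

The clone still has its own `.git` directory, so commit, push and pull work from inside `teamproj1` as in any other clone.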

I've added some drawings to facilitate understanding of this scheme


alexander.polomodov
  • first of all thanks for your effort... Now my quest, if I do git clone ... teamproj1, it will clone the whole main repo... which is huge and that what I really dont want. – user2793078 Mar 11 '16 at 00:11
  • this is the problem for sparse checkout in git `Unfortunately it does not affect the size of the overall local repository but can be helpful if you have a huge tree of folders`. You checkout all repo, but git shows you only subset if you enabled `core.sparsecheckout` feature. See: http://blogs.atlassian.com/2014/05/handle-big-repositories-git/ – alexander.polomodov Mar 11 '16 at 00:17
  • thats unbelievable. So it means there is nothing really like a sub dir check-out like svn. What you mentioned sounds like 'checkout everything and then hide the unwanted stuff' – user2793078 Mar 11 '16 at 01:09
  • It's not appropriate using of git. Why you need to use one repository for all projects? Maybe you can put each project to separate repo? – alexander.polomodov Mar 11 '16 at 01:11
1

The recent git sparse-checkout I mention here can help, especially when combined with git clone --filter=blob:none --no-checkout

git clone --filter=blob:none --no-checkout https://github.com/<me>/<myrepo>
cd <myrepo>
git config core.sparseCheckoutCone false
git sparse-checkout disable

# Add the expected pattern, to include just a subfolder without top files:
git sparse-checkout set /mySubFolder/

# populate working-tree with only the right files:
git read-tree -mu HEAD
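
A minimal local sketch of the same sequence, assuming a throwaway repository instead of GitHub. The folder names, the `TOP` file and the `demo` identity are hypothetical, and `--filter=blob:none` is left out because the demo clones from a plain local path rather than a server that supports partial clone.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch area for the demo; delete it when done

# Stand-in for the remote: three folders plus one top-level file.
git init -q "$work/main_repo"
for p in proj1 proj2 proj3; do
    mkdir "$work/main_repo/$p"
    echo "hello from $p" > "$work/main_repo/$p/README"
done
echo "top-level file" > "$work/main_repo/TOP"
git -C "$work/main_repo" add .
git -C "$work/main_repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m 'Initial commit'

# Clone without populating the working tree.
git clone -q --no-checkout "$work/main_repo" "$work/clone"

# Non-cone pattern anchored at the root: only the proj1 folder.
git -C "$work/clone" config core.sparseCheckoutCone false
git -C "$work/clone" sparse-checkout set /proj1/

# Populate the working tree with the matching files only.
git -C "$work/clone" read-tree -mu HEAD

visible=$(ls "$work/clone")
echo "working tree: $visible"
```

The leading and trailing slashes in `/proj1/` anchor the pattern to a top-level directory, which is why `TOP` is not checked out.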

Plus, with Git 2.32 (Q2 2021), "git add" and "git rm" learned not to touch paths that are outside of the sparse checkout.

So you won't accidentally modify paths outside of the relevant subfolders.

See commit d5f4b82, commit a20f704, commit b243012, commit 719630e, commit d73dbaf, commit 6594afc, commit 4e95698 (08 Apr 2021) by Matheus Tavares (matheustavares).
(Merged by Junio C Hamano -- gitster -- in commit fe069dc, 07 May 2021)

rm: honor sparse checkout patterns

Suggested-by: Elijah Newren
Signed-off-by: Matheus Tavares

git add refrains from adding or updating index entries that are outside the current sparse checkout, but git rm doesn't follow the same restriction.
This is somewhat counter-intuitive and inconsistent.
So make rm honor the sparsity rules and advise on how to remove SKIP_WORKTREE entries just like add does.
Also add some tests for the new behavior.
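
The SKIP_WORKTREE entries mentioned in this commit message can be inspected with `git ls-files -t`, which tags such index entries with `S`. A small sketch, assuming a throwaway local repository (all names and the `demo` identity are hypothetical):

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch area for the demo; delete it when done

git init -q "$work/main_repo"
for p in proj1 proj2 proj3; do
    mkdir "$work/main_repo/$p"
    echo "hello from $p" > "$work/main_repo/$p/README"
done
git -C "$work/main_repo" add .
git -C "$work/main_repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m 'Initial commit'

# Full clone, then narrow the working tree to proj1 (non-cone pattern).
git clone -q "$work/main_repo" "$work/clone"
git -C "$work/clone" config core.sparseCheckoutCone false
git -C "$work/clone" sparse-checkout set /proj1/

# Tag every index entry: H = checked out, S = skip-worktree.
tags=$(git -C "$work/clone" ls-files -t)
echo "$tags"

# Count the entries that are in the index but hidden from the working tree.
skipped=$(echo "$tags" | grep -c '^S ')
echo "skip-worktree entries: $skipped"
```

The entries tagged `S` are still tracked in the index; they are just absent from the working tree, which is why commands that update the index need a rule for them.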

git config now includes in its man page:

Advice shown when either git add or git rm is asked to update index entries outside the current sparse checkout.

git rm now includes in its man page:

allowing the file to be removed from just the index. When sparse-checkouts are in use (see git sparse-checkout), git rm will only remove paths within the sparse-checkout patterns.


With Git 2.34 (Q4 2021), "git add" can work better with the sparse index.

See commit 42f8ed6, commit 939fa07, commit 4eaffd8, commit 5e7cbab, commit 83ad8ca (29 Jul 2021) by Derrick Stolee (derrickstolee).
(Merged by Junio C Hamano -- gitster -- in commit 2f71366, 24 Aug 2021)

add: ignore outside the sparse-checkout in refresh()

Reviewed-by: Elijah Newren
Signed-off-by: Derrick Stolee

Since b243012 (refresh_index(): add flag to ignore SKIP_WORKTREE entries, 2021-04-08, Git v2.32.0-rc0 -- merge listed in batch #14), 'git add --refresh <path>' will output a warning message when the path is outside the sparse-checkout definition.
The implementation of this warning happened in parallel with the sparse-index work to add ensure_full_index() calls throughout the codebase.

Update this loop to have the proper logic that checks to see if the pathspec is outside the sparse-checkout definition.
This avoids the need to expand the sparse directory entry and determine if the path is tracked, untracked, or ignored.
We simply avoid updating the stat() information because there isn't even an entry that matches the path!

VonC
-2

First off, git is not svn: different concepts, different approaches. Git is a full source-control system on the client, not a front-end to a remote server.

One of the differences is that you clone the whole repository, not subsections. I believe there are ways to make it appear that you've only cloned a subset, but in reality the whole repo is there.

In the testing/usage I've done, it doesn't matter. Yes, perhaps the initial cloning is a little slower, but after that it's really fast. My company has a large 4 GB+ source tree repository and 100s of developers. Everyone has a complete copy of the repo even though their responsibility might be segmented, and even remote over our mediocre corporate WAN there have been no problems.

Scott Sosna