
I have noticed that subsequent checkouts from a bare repository take much less time than the initial one:

git --git-dir=/path/to/bare/repo --work-tree=/path/to/working/tree checkout <ref> .

That would be explainable if file reads were much faster than writes, but they are not. I then touched all the working tree files recursively and found that the next git checkout took as long as the initial one. So Git somehow relies on the files' modification timestamps.

Hence my question: where does Git store working tree checkout data (or, more precisely, checkout timestamps) for bare (possibly read-only) repositories?

UPDATE

git clone https://github.com/torvalds/linux.git
mv linux/.git linux.git
rm -rf linux
cd linux.git
git config --bool core.bare true
cd ..
mkdir work-tree

time git --git-dir=linux.git --work-tree=work-tree checkout master .
Updated 72867 paths from bb1aa9635246
git --git-dir=linux.git --work-tree=work-tree checkout master .  6.10s user 12.39s system 100% cpu 18.339 total

time git --git-dir=linux.git --work-tree=work-tree checkout master .
Updated 26 paths from bb1aa9635246
git --git-dir=linux.git --work-tree=work-tree checkout master .  0.23s user 0.92s system 181% cpu 0.635 total

time git --git-dir=linux.git --work-tree=work-tree checkout v5.13 .
Updated 11126 paths from e07e8b0138ae
git --git-dir=linux.git --work-tree=work-tree checkout v5.13 .  1.92s user 3.96s system 105% cpu 5.548 total

time git --git-dir=linux.git --work-tree=work-tree checkout v5.13 .
Updated 26 paths from e07e8b0138ae
git --git-dir=linux.git --work-tree=work-tree checkout v5.13 .  0.26s user 0.93s system 175% cpu 0.681 total

time git --git-dir=linux.git --work-tree=work-tree checkout master .
Updated 10634 paths from bb1aa9635246
git --git-dir=linux.git --work-tree=work-tree checkout master .  1.87s user 4.02s system 104% cpu 5.614 total

time git --git-dir=linux.git --work-tree=work-tree checkout master .
Updated 26 paths from bb1aa9635246
git --git-dir=linux.git --work-tree=work-tree checkout master .  0.26s user 0.97s system 176% cpu 0.697 total

find work-tree -type f -exec touch {} +
time git --git-dir=linux.git --work-tree=work-tree checkout master .
Updated 72829 paths from bb1aa9635246
git --git-dir=linux.git --work-tree=work-tree checkout master .  6.10s user 18.67s system 99% cpu 25.010 total
ababo
  • "where does git store working tree checkout data" In the index (staging area)? No different than for a normal repository. – matt Jul 30 '21 at 07:42

1 Answer


The short answer is that it doesn't store it ... correctly, that is. Well, it might. Or it might not! It all depends on how you use it.

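Git always uses its index to do git checkout. The index caches stat data (including the modification time) for every file it tracks, which is why touching the files makes the next checkout slow again. When you're in a bare repository, which has no working tree, you can force Git to believe in a working tree path; Git then uses its default index (the file named index inside the Git directory) unless you force Git to use some alternate index, e.g. by setting the GIT_INDEX_FILE environment variable.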

This has unfortunate side effects if you run, e.g.:

git --work-tree=wt1 checkout <commit1>
git --work-tree=wt2 checkout <commit2>

back and forth with various commit hash IDs over time, because the (single) index is now assumed to correctly describe both working trees.

This is clearly not possible. The result is messed-up working trees.

The trick to make this work is to use one index per working tree. If you only ever use one working tree path, the one index does actually work fine. If you use more than one, it stops working.
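As a rough sketch of the one-index-per-working-tree idea, the script below points GIT_INDEX_FILE at a separate index file for each working tree. The throwaway repository, the index-wt1 / index-wt2 file names, and the checkout_into helper are all assumptions made up for illustration; only GIT_INDEX_FILE, --git-dir and --work-tree are real Git features.

```shell
#!/bin/sh
# Sketch only: one bare repository, two work trees, one index file per work tree.
set -e

# Hypothetical throwaway setup so the example is self-contained.
demo=$(mktemp -d)
git init --quiet "$demo/src"
cd "$demo/src"
git config user.email demo@example.com
git config user.name demo
echo one > file.txt
git add file.txt
git commit --quiet -m first
echo two > file.txt
git commit --quiet -am second
cd "$demo"
git clone --quiet --bare src repo.git
mkdir wt1 wt2

# checkout_into is a hypothetical helper: $1 = work tree name, $2 = commit-ish.
# Each work tree gets its own index file (an absolute path inside the bare repo),
# so the stat data cached for wt1 never gets mixed up with that of wt2.
checkout_into() {
    GIT_INDEX_FILE="$demo/repo.git/index-$1" \
        git --git-dir="$demo/repo.git" --work-tree="$demo/$1" \
        checkout --quiet "$2" -- .
}

checkout_into wt1 HEAD~1
checkout_into wt2 HEAD
# Going back to wt1 reuses index-wt1, which still describes wt1 accurately.
checkout_into wt1 HEAD~1

cat "$demo/wt1/file.txt" "$demo/wt2/file.txt"
```

Because each invocation sees only the index that belongs to its own working tree, repeated checkouts into the same tree stay fast and the two trees do not corrupt each other's state.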

(Your best bet is not to use a bare repository like this at all. Sometimes there are reasons you want to, though. The second-best-bet is to be a Git expert and know all these tricky things, perhaps, but there's a problem with this technique: Git is actively evolving. For instance, git worktree add actually adds <index, HEAD, working tree> tuples, more or less. That's another way to go here, but be aware that there are tricky corner cases here.)
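For comparison, here is a minimal sketch of the git worktree add route against a bare repository. The repository contents and the wt1 / wt2 directory names are hypothetical; the point is only that each added working tree reports its own index path, so Git does the per-tree bookkeeping itself.

```shell
#!/bin/sh
# Sketch only: let "git worktree add" manage one index per work tree.
set -e

# Hypothetical throwaway setup so the example is self-contained.
demo=$(mktemp -d)
git init --quiet "$demo/src"
cd "$demo/src"
git config user.email demo@example.com
git config user.name demo
echo one > file.txt
git add file.txt
git commit --quiet -m first
echo two > file.txt
git commit --quiet -am second
cd "$demo"
git clone --quiet --bare src repo.git

# Add two detached work trees at different commits.
git --git-dir="$demo/repo.git" worktree add --detach "$demo/wt1" HEAD~1 >/dev/null 2>&1
git --git-dir="$demo/repo.git" worktree add --detach "$demo/wt2" HEAD >/dev/null 2>&1

# Ask Git where each work tree keeps its index.
idx1=$(git -C "$demo/wt1" rev-parse --git-path index)
idx2=$(git -C "$demo/wt2" rev-parse --git-path index)
echo "wt1 index: $idx1"
echo "wt2 index: $idx2"
```

The trade-off is that these working trees are registered inside the repository, so this does not fit a read-only bare repository.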

torek