The git clone --depth=1 ... suggested in 2014 will become faster in Q2 2019 with Git 2.22.
That is because, during an initial "git clone --depth=..." partial clone, it is pointless to spend cycles for a large portion of the connectivity check that enumerates and skips promisor objects (which by definition is all objects fetched from the other side).
This has been optimized out.
clone: do faster object check for partial clones
For partial clones, doing a full connectivity check is wasteful; we skip
promisor objects (which, for a partial clone, is all known objects), and
enumerating them all to exclude them from the connectivity check can
take a significant amount of time on large repos.
At most, we want to make sure that we get the objects referred to by any
wanted refs.
For partial clones, just check that these objects were transferred.
Result:
Test                          dfa33a2^             dfa33a2
-------------------------------------------------------------------------
5600.2: clone without blobs   18.41(22.72+1.09)    6.83(11.65+0.50) -62.9%
5600.3: checkout of result    1.82(3.24+0.26)      1.84(3.24+0.26) +1.1%
62% faster!
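The difference between the two checks can be illustrated with a minimal sketch. This is a toy model, not Git's implementation: the `store` dictionary, the object names and both function names are hypothetical.

```python
# Toy object store: object id -> ids it references (hypothetical data, not Git's format).
def full_connectivity_check(store, tips):
    """Walk everything reachable from the tips; fail if any referenced object is missing."""
    seen, stack = set(), list(tips)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        if oid not in store:
            return False
        seen.add(oid)
        stack.extend(store[oid])
    return True


def partial_clone_check(store, tips):
    """Only confirm that the objects the wanted refs point to were transferred."""
    return all(oid in store for oid in tips)


# A clone without blobs: blob1 is referenced by tree1 but was filtered out.
store = {"commit1": ["tree1"], "tree1": ["blob1"]}
tips = ["commit1"]
```

In this model the full walk visits every object and still reports a "missing" blob that was omitted on purpose, while the cheap check looks only at the ref tips.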
With Git 2.26 (Q1 2020), an unneeded connectivity check is now disabled in a partial clone when fetching into it.
See commit 2df1aa2, commit 5003377 (12 Jan 2020) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 8fb3945, 14 Feb 2020)
connected: verify promisor-ness of partial clone
Signed-off-by: Jonathan Tan
Reviewed-by: Jonathan Nieder
Commit dfa33a298d ("clone: do faster object check for partial clones", 2019-04-21, Git v2.22.0-rc0 -- merge) optimized the connectivity check done when cloning with --filter to check only the existence of objects directly pointed to by refs.
But this is not sufficient: they also need to be promisor objects.
Make this check more robust by instead checking that these objects are promisor objects, that is, they appear in a promisor pack.
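As a rough sketch of the stricter check (a simplified model, assuming a pack is just a set of object ids plus a flag standing in for the ".promisor" marker that accompanies a promisor packfile; `Pack` and `tips_are_promisor_objects` are hypothetical names):

```python
from dataclasses import dataclass


@dataclass
class Pack:
    oids: set
    is_promisor: bool  # stands in for the ".promisor" marker next to a packfile


def tips_are_promisor_objects(packs, tips):
    """Existence alone is not enough: every tip must appear in a promisor pack."""
    promisor_oids = set()
    for pack in packs:
        if pack.is_promisor:
            promisor_oids |= pack.oids
    return all(oid in promisor_oids for oid in tips)


packs = [Pack({"a", "b"}, True), Pack({"c"}, False)]
```

Here "c" exists locally but sits in an ordinary pack, so a ref pointing at it does not pass this check.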
And:
fetch: forgo full connectivity check if --filter
Signed-off-by: Jonathan Tan
Reviewed-by: Jonathan Nieder
If a filter is specified, we do not need a full connectivity check on the contents of the packfile we just fetched; we only need to check that the objects referenced are promisor objects.
This significantly speeds up fetches into repositories that have many promisor objects, because during the connectivity check, all promisor objects are enumerated (to mark them UNINTERESTING), and that takes a significant amount of time.
And, still with Git 2.26 (Q1 2020), the object reachability bitmap machinery and the partial cloning machinery were not prepared to work well together, because some object-filtering criteria that partial clones use inherently rely on object traversal, but the bitmap machinery is an optimization to bypass that object traversal.
There are, however, some cases where they can work together, and they were taught about them.
See commit 20a5fd8 (18 Feb 2020) by Junio C Hamano (gitster).
See commit 3ab3185, commit 84243da, commit 4f3bd56, commit cc4aa28, commit 2aaeb9a, commit 6663ae0, commit 4eb707e, commit ea047a8, commit 608d9c9, commit 55cb10f, commit 792f811, commit d90fe06 (14 Feb 2020), and commit e03f928, commit acac50d, commit 551cf8b (13 Feb 2020) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 0df82d9, 02 Mar 2020)
pack-bitmap: implement BLOB_LIMIT filtering
Signed-off-by: Jeff King
Just as the previous commit implemented BLOB_NONE, we can support BLOB_LIMIT filters by looking at the sizes of any blobs in the result and unsetting their bits as appropriate.
This is slightly more expensive than BLOB_NONE, but still produces a noticeable speedup (these results are on git.git):
Test                                         HEAD~2              HEAD
------------------------------------------------------------------------------------
5310.9:  rev-list count with blob:none       1.80(1.77+0.02)     0.22(0.20+0.02) -87.8%
5310.10: rev-list count with blob:limit=1k   1.99(1.96+0.03)     0.29(0.25+0.03) -85.4%
The implementation is similar to the BLOB_NONE one, with the exception that we have to go object-by-object while walking the blob-type bitmap (since we can't mask out the matches, but must look up the size individually for each blob).
The trick with using ctz64() is taken from show_objects_for_type(), which likewise needs to find individual bits (but wants to quickly skip over big chunks without blobs).
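The bit-by-bit walk described above might look roughly like the following. It is only a sketch in Python rather than Git's C code: bitmaps are modelled as lists of 64-bit words, `sizes` is a hypothetical lookup from bit position to blob size, and the "size at least the limit is dropped" rule is my reading of the blob:limit filter.

```python
def ctz64(word):
    """Count trailing zero bits of a non-zero 64-bit word."""
    return (word & -word).bit_length() - 1


def filter_blob_limit(result, blob_type, sizes, limit):
    """Return a copy of `result` with the bits of too-large blobs unset."""
    out = list(result)
    for w, (res_word, type_word) in enumerate(zip(result, blob_type)):
        word = res_word & type_word  # blobs that are present in the result
        # A zero word (a chunk without blobs) skips the inner loop entirely.
        while word:
            bit = ctz64(word)
            pos = w * 64 + bit
            if sizes[pos] >= limit:  # size must be looked up per blob
                out[w] &= ~(1 << bit)
            word &= word - 1  # clear the lowest set bit and continue
    return out
```

For instance, with `result = [0b1111]`, `blob_type = [0b0110]` and sizes of 10 and 5000 bytes for bits 1 and 2, a 1024-byte limit clears only bit 2.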
Git 2.27 (Q2 2020) will simplify the commit ancestry connectedness check in a partial clone repository in which "promised" objects are assumed to be obtainable lazily on-demand from promisor remote repositories.
See commit 2b98478 (20 Mar 2020) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 0c60105, 22 Apr 2020)
connected: always use partial clone optimization
Signed-off-by: Jonathan Tan
Reviewed-by: Josh Steadmon
With 50033772d5 ("connected: verify promisor-ness of partial clone", 2020-01-30, Git v2.26.0-rc0 -- merge listed in batch #5), the fast path (checking promisor packs) in check_connected() now passes a subset of the slow path (rev-list):
- if all objects to be checked are found in promisor packs, both the fast path and the slow path will pass;
- otherwise, the fast path will definitely not pass.
This means that we can always attempt the fast path whenever we need to do the slow path.
The fast path is currently guarded by a flag; therefore, remove that flag.
Also, make the fast path fall back to the slow path - if the fast path fails, the failing OID and all remaining OIDs will be passed to rev-list.
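The control flow described here can be sketched as follows. This is a simplified model, not the code of check_connected(): `promisor_oids` is assumed to be a precomputed set, and `slow_path` is a hypothetical callback standing in for the rev-list invocation.

```python
def check_connected(oids, promisor_oids, slow_path):
    """Try the cheap promisor-pack check first; hand the rest to the slow path."""
    for i, oid in enumerate(oids):
        if oid not in promisor_oids:
            # The failing OID and all remaining OIDs go to the slow path.
            return slow_path(oids[i:])
    return True  # everything was found in promisor packs


calls = []


def recording_slow_path(remaining):
    """Hypothetical stand-in for rev-list that records what it was given."""
    calls.append(list(remaining))
    return True
```

When every OID is in a promisor pack the slow path is never invoked, which is the case a no-op fetch into a partial clone benefits from.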
The main user-visible benefit is the performance of fetch from a partial clone - specifically, the speedup of the connectivity check done before the fetch.
In particular, a no-op fetch into a partial clone on my computer was sped up from 7 seconds to 0.01 seconds. This is a complement to the work in 2df1aa239c ("fetch: forgo full connectivity check if --filter", 2020-01-30, Git v2.26.0-rc0 -- merge listed in batch #5), which is the child of the aforementioned 50033772d5. In that commit, the connectivity check after the fetch was sped up.
The addition of the fast path might cause performance reductions in these cases:
- If a partial clone or a fetch into a partial clone fails, Git will fruitlessly run rev-list (it is expected that everything fetched would go into promisor packs, so if that didn't happen, it is most likely that rev-list will fail too).
- Any connectivity checks done by receive-pack, in the (in my opinion, unlikely) event that a partial clone serves receive-pack.
I think that these cases are rare enough, and the performance reduction in this case minor enough (additional object DB access), that the benefit of avoiding a flag outweighs these.
With Git 2.27 (Q2 2020), the object walk with object filter "--filter=tree:0" can now take advantage of the pack bitmap when available.
See commit 9639474, commit 5bf7f1e (04 May 2020) by Jeff King (peff).
See commit b0a8d48, commit 856e12c (04 May 2020) by Taylor Blau (ttaylorr).
(Merged by Junio C Hamano -- gitster -- in commit 69ae8ff, 13 May 2020)
pack-bitmap.c: support 'tree:0' filtering
Signed-off-by: Taylor Blau
In the previous patch, we made it easy to define other filters that exclude all objects of a certain type. Use that in order to implement bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is equal to 0.
The general case is not helped by bitmaps, since for values of 'n > 0', the object filtering machinery requires a full-blown tree traversal in order to determine the depth of a given tree.
Caching this is non-obvious, too, since the same tree object can have a different depth depending on the context (e.g., a tree was moved up in the directory hierarchy between two commits).
But, the 'n = 0' case can be helped, and this patch does so.
Running p5310.11 in this tree and on master with the kernel, we can see that this case is helped substantially:
Test                                  master              this tree
--------------------------------------------------------------------------------
5310.11: rev-list count with tree:0   10.68(10.39+0.27)   0.06(0.04+0.01) -99.4%
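Excluding every object of a given type needs no per-object lookup: it is a plain mask against the type bitmaps. A minimal sketch, assuming bitmaps are lists of integer words and that a depth-0 tree filter drops both trees and the blobs they would lead to (the function name is hypothetical):

```python
def filter_tree_depth_zero(result, tree_type, blob_type):
    """Mask all tree and blob bits out of the result, one word at a time."""
    return [r & ~t & ~b for r, t, b in zip(result, tree_type, blob_type)]


# Bits 0 and 3 are commits, bit 1 is a tree, bit 2 is a blob.
filtered = filter_tree_depth_zero([0b1111], [0b0010], [0b0100])
```

This word-at-a-time masking is what the bitmap makes cheap; a filter with 'n > 0' cannot be expressed this way because a tree's depth is not a property of the object alone.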
And:
pack-bitmap: pass object filter to fill-in traversal
Signed-off-by: Jeff King
Signed-off-by: Taylor Blau
Sometimes a bitmap traversal still has to walk some commits manually, because those commits aren't included in the bitmap packfile (e.g., due to a push or commit since the last full repack).
If we're given an object filter, we don't pass it down to this traversal.
It's not necessary for correctness because the bitmap code has its own filters to post-process the bitmap result (which it must, to filter out the objects that are mentioned in the bitmapped packfile).
And with blob filters, there was no performance reason to pass along those filters, either. The fill-in traversal could omit them from the result, but it wouldn't save us any time to do so, since we'd still have to walk each tree entry to see if it's a blob or not.
But now that we support tree filters, there's opportunity for savings. A tree:depth=0 filter means we can avoid accessing trees entirely, since we know we won't need them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one where HEAD~100 and its ancestors are all in a bitmapped pack, but HEAD~100..HEAD are not).
Here are the results (run against linux.git):
Test                                                  HEAD^               HEAD
-------------------------------------------------------------------------------------------------
[...]
5310.16: rev-list with tree filter (partial bitmap)   0.19(0.17+0.02)     0.03(0.02+0.01) -84.2%
The absolute number of savings isn't huge, but keep in mind that we only omitted 100 first-parent links (in the version of linux.git here, that's 894 actual commits).
In a more pathological case, we might have a much larger proportion of non-bitmapped commits. I didn't bother creating such a case in the perf script because the setup is expensive, and this is plenty to show the savings as a percentage.
With Git 2.32 (Q2 2021), handling of "promisor packs" that allows certain objects to be missing and lazily retrievable has been optimized (a bit).
See commit c1fa951, commit 45a187c, commit fcc07e9 (13 Apr 2021) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 13158b9, 30 Apr 2021)
revision: avoid parsing with --exclude-promisor-objects
Signed-off-by: Jeff King
When --exclude-promisor-objects is given, before traversing any objects we iterate over all of the objects in any promisor packs, marking them as UNINTERESTING and SEEN.
We turn the oid we get from iterating the pack into an object with parse_object(), but this has two problems:
- it's slow; we are zlib inflating (and reconstructing from deltas) every byte of every object in the packfile
- it leaves the tree buffers attached to their structs, which means our heap usage will grow to store every uncompressed tree simultaneously.
This can be gigabytes.
We can obviously fix the second by freeing the tree buffers after we've parsed them.
But we can observe that the function doesn't look at the object contents at all! The only reason we call parse_object() is that we need a "struct object" on which to set the flags.
There are two options here:
- we can look up just the object type via oid_object_info(), and then call the appropriate lookup_foo() function
- we can call lookup_unknown_object(), which gives us an OBJ_NONE struct (which will get auto-converted later by object_as_type() via calls to lookup_commit(), etc).
The first one is closer to the current code, but we do pay the price to look up the type for each object.
The latter should be more efficient in CPU, though it wastes a little bit of memory (the "unknown" object structs are a union of all object types, so some of the structs are bigger than they need to be).
It also runs the risk of triggering a latent bug in code that calls lookup_object() directly but isn't ready to handle OBJ_NONE (such code would already be buggy, but we use lookup_unknown_object() infrequently enough that it might be hiding).
I went with the second option here.
I don't think the risk is high (and we'd want to find and fix any such bugs anyway), and it should be more efficient overall.
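The idea behind the second option can be modelled with a small sketch. It is only an analogy to the C structures: `ObjectTable`, `lookup_typed` and the `inflations` counter are hypothetical, and `type = None` plays the role of OBJ_NONE.

```python
UNINTERESTING, SEEN = 1, 2


class Obj:
    def __init__(self, oid):
        self.oid = oid
        self.type = None  # plays the role of OBJ_NONE
        self.flags = 0


class ObjectTable:
    def __init__(self):
        self.objects = {}
        self.inflations = 0  # counts expensive content reads

    def parse_object(self, oid, odb):
        """Expensive route: read the object from `odb` just to learn its type."""
        self.inflations += 1
        obj = self.objects.setdefault(oid, Obj(oid))
        obj.type = odb[oid]
        return obj

    def lookup_unknown_object(self, oid):
        """Cheap route: hand out an untyped placeholder without reading anything."""
        return self.objects.setdefault(oid, Obj(oid))

    def lookup_typed(self, oid, type_):
        """Later typed lookups convert the placeholder in place."""
        obj = self.lookup_unknown_object(oid)
        if obj.type is None:
            obj.type = type_
        return obj


def mark_promisor_objects(table, promisor_oids):
    """Flag every promisor object without touching its contents."""
    for oid in promisor_oids:
        table.lookup_unknown_object(oid).flags |= UNINTERESTING | SEEN
```

Marking needs only somewhere to store the flags, so no object content is read; the type is filled in later by whichever typed lookup reaches the object first.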
The new tests in p5600 show off the improvement (this is on git.git):
Test                                 HEAD^               HEAD
-------------------------------------------------------------------------------
5600.5: count commits                0.37(0.37+0.00)     0.38(0.38+0.00) +2.7%
5600.6: count non-promisor commits   11.74(11.37+0.37)   0.04(0.03+0.00) -99.7%
The improvement is particularly big in this script because every object in the newly-cloned partial repo is a promisor object.
So after marking them all, there's nothing left to traverse.
An earlier optimization discarded a tree-object buffer that is still in use, which has been corrected with Git 2.38 (Q3 2022).
See commit 1490d7d (14 Aug 2022) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 01a30a5, 25 Aug 2022)
Reported-by: Andrew Olsen
Signed-off-by: Jeff King
Since commit fcc07e9 ("is_promisor_object(): free tree buffer after parsing", 2021-04-13, Git v2.32.0-rc0 -- merge listed in batch #13), we'll always free the buffers attached to a "struct tree" after searching them for promisor links.
But there's an important case where we don't want to do so: if somebody else is already using the tree!
This can happen during a "rev-list --missing=allow-promisor" traversal in a partial clone that is missing one or more trees or blobs.
Even a trivial use of "--missing=allow-promisor" triggers this problem, as the included test demonstrates (it's just a vanilla --blob:none clone).
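One way to avoid freeing a buffer that somebody else is using is to release it only when this code was the one that parsed the tree. The sketch below illustrates that ownership rule with hypothetical names; it is an assumption about the general approach, not a transcription of the fix in commit 1490d7d.

```python
class Tree:
    def __init__(self, raw_entries):
        self._raw = raw_entries
        self.buffer = None
        self.parsed = False

    def parse(self):
        """Attach the buffer if it is not already attached."""
        if not self.parsed:
            self.buffer = list(self._raw)
            self.parsed = True

    def free_buffer(self):
        self.buffer = None
        self.parsed = False


def scan_tree_for_links(tree, visit):
    """Visit every entry, then free the buffer only if we attached it ourselves."""
    we_parsed = not tree.parsed
    tree.parse()
    for entry in tree.buffer:
        visit(entry)
    if we_parsed:
        tree.free_buffer()
```

A caller that parsed the tree before the scan keeps its buffer, while a tree parsed only for the scan is released again, which preserves the memory saving of the earlier optimization.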