I would like to fetch only the commits of branchA not present in its base branchB.
For example, consider this history:
    B1 - B2 - B3 - B4 - B5
           \
            A1 - A2 - A3
I would like to fetch only A1, A2 and A3.
It's important to note that I don't know up front which commit is A1, or how many commits I need to fetch.
My input is just the heads of the two branches, in this example branchA=A3 and branchB=B5.
Based on such input I need to identify A1 and fetch everything between A1 and branchA, and ideally nothing more.
Alternatively, fetching a minimal set of commits that includes A1, A2 and A3, plus enough information to identify A1, would be interesting too.
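To make the goal concrete, here is a minimal sketch of what I want to end up with, shown on a throwaway local repository where the full history is already available. The repository layout, the `demo` directory and the fork point (placed at B2 purely for illustration) are assumptions made for this example. With full history, `git log branchB..branchA` lists exactly the wanted commits, and `git merge-base` gives the commit that A1 was forked from; the whole question is how to get there without fetching everything first.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository reproducing the example history (illustrative only).
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.email demo@example.com
git config user.name demo
git symbolic-ref HEAD refs/heads/branchB

commit() { git commit -q --allow-empty -m "$1"; }

commit B1; commit B2
git checkout -q -b branchA      # assumed fork point: B2
commit A1; commit A2; commit A3
git checkout -q branchB
commit B3; commit B4; commit B5

# Commits reachable from branchA but not from branchB, oldest first.
only_in_a=$(git log --reverse --format=%s branchB..branchA | xargs)
echo "only in branchA: $only_in_a"

# The commit both branches share, i.e. the parent of A1.
fork_subject=$(git log -1 --format=%s "$(git merge-base branchA branchB)")
echo "fork point: $fork_subject"
```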
Why? In a use case where I only need those commits ("what changed in branchA relative to branchB?"), fetching more than the necessary commits slows down my process. Take for example a large repository with thousands of commits, and feature branches with only a few commits. Fetching the entire history of branchA and branchB fetches a lot of commits I don't need, and takes a lot of time and network bandwidth.
I came up with an ugly hack that avoids fetching the full history, by starting from shallow clones, and incrementally fetching more and more until a common commit is found:
    # Start from the tip of branchA only.
    git clone --depth 1 "$repo" --branch "$branchA" shallow
    cd shallow
    # Double the depth until the shallow histories of both branches overlap.
    for ((depth = 8; depth <= 1024; depth *= 2)); do
        echo "trying depth $depth ..."
        git fetch --depth "$depth"
        git fetch --depth "$depth" origin "$branchB:$branchB"
        # Oldest commit of branchB fetched so far.
        lastrev=$(git rev-list --reverse "$branchB" | head -n1)
        if git merge-base --is-ancestor "$lastrev" HEAD; then
            echo "found with depth=$depth"
            break
        fi
    done
This works for my use case: it fetches a large enough subset of commits to identify A1 and include commits until the head of branchA, and it's faster than fetching the complete history of the two branches.
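For anyone who wants to try the hack without a real remote, the following is a self-contained sketch that runs the same doubling loop against a local repository served over `file://`. The history shape (B1 to B40 with branchA forking at B20), the directory names and the `found_depth` variable are all assumptions made for this reproduction. Once the loop stops, `branchB..HEAD` should contain only the commits of branchA.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical local "remote": B1..B40 on branchB, branchA (A1..A3) forks at B20.
work=$(mktemp -d)
origin="$work/origin"
git init -q "$origin"
cd "$origin"
git config user.email demo@example.com
git config user.name demo
git symbolic-ref HEAD refs/heads/branchB
for ((i = 1; i <= 20; i++)); do git commit -q --allow-empty -m "B$i"; done
git branch branchA
for ((i = 21; i <= 40; i++)); do git commit -q --allow-empty -m "B$i"; done
git checkout -q branchA
for i in 1 2 3; do git commit -q --allow-empty -m "A$i"; done

# A file:// URL goes through the regular transport, so --depth is honoured.
git clone -q --depth 1 "file://$origin" --branch branchA "$work/shallow"
cd "$work/shallow"

found_depth=0
for ((depth = 8; depth <= 1024; depth *= 2)); do
    git fetch -q --depth "$depth"
    git fetch -q --depth "$depth" origin "branchB:branchB"
    # Oldest branchB commit available locally (tail reads the whole list).
    lastrev=$(git rev-list branchB | tail -n1)
    if git merge-base --is-ancestor "$lastrev" HEAD; then
        found_depth=$depth
        break
    fi
done

# With overlapping histories, this range is exactly the branchA-only commits.
only_in_a=$(git log --reverse --format=%s branchB..HEAD | xargs)
echo "stopped at depth: $found_depth"
echo "only in branchA: $only_in_a"
```

The sketch uses `tail -n1` on the unreversed list instead of `--reverse | head -n1`; both pick the oldest commit of a linear history, and the choice here is only a matter of taste.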
Is there a better way than this? I'm looking for a pure Git solution, but if the GitHub API has something to make this faster and easier, that can be interesting too.