git commits on a specific branch

Question

I'm looking for the commits of a specific branch. My tree looks as follows:

feature/X     C--E--F
             /       \
master  -A--B--D---G--H--I--J->

How can I get commits C, E and F? What I tried is:

git rev-list feature/X ^master

but this returns no commits. I assume the problem in this case is that feature/X was merged back into master, which makes commits C, E and F reachable from master too. Is that right? If so, how can I handle this situation? Any ideas?

Regards Thomas

Cool, thanks, that works. But how can I identify B automatically? To figure this out I would have to check which commits of feature/X are reachable from any other reference, wouldn't I? And as far as I understand, C, E and F are reachable from master. — user3592527, Sep 27 '16 at 08:19

Noufal Ibrahim · Answer 1 · 2016-09-27T08:36:51.590

Generally speaking, once a branch is merged, the information about which branch a commit came from is lost. All you know is that the commits are reachable from the current branch.

However, there is a roundabout way. First, find where the merge took place. You can use git log --merges -1 to find the most recent merge on master (in your case, H). I assume the featureX branch still points at F, right behind this merge. The merge commit has two parents. Since featureX was merged into master, the parent on the master side is the first parent, H^.

Then list the difference between H^ and F with git log H^..featureX, which shows all commits reachable from featureX except those reachable from H^, i.e. C, E and F.

As an example, here is a repo.

If done right, I should get all the "Updates x" commits.

First, we get the merge commit.

% git log --merges -1 --oneline
e8928b9 Merge branch 'X'

Then we get the log in question. The feature branch is called X in my repo.

% git log e8928\^..X --oneline
92a1f58 Updates x 10
f56306d Updates x 9
54d2253 Updates x 8
a8ba58b Updates x 7
10d08c5 Updates x 6
625d267 Updates x 5
96671d4 Updates x 4
5031498 Updates x 3
41770ea Updates x 2
442033b Updates x 1

This is, at best, hackish. I'd be very interested in finding a genuine solution.

qzb · Answer 2 · 2016-09-27T09:18:58.277


Here is a slightly hackish but working one-liner:

git rev-list "$(git rev-list feature/X..master | tail -n 1)^"..feature/X

edited Sep 27 '16 at 09:18

answered Sep 27 '16 at 08:28


Thanks, this looks like my solution. Just for my understanding: what exactly does the ^ after the sha do? – user3592527 Sep 27 '16 at 09:08
@user3592527 `sha^` means "first parent of `sha`" – qzb Sep 27 '16 at 09:17
@rudimeier I've changed answer to your version. – qzb Sep 27 '16 at 09:20

score 1 · Answer 3 · edited May 23 '17 at 10:30

It may help to re-draw this:

feature/X     C--E--F
             /       \
master  -A--B--D---G--H--I--J->

as this:

        C--E--F           <-- feature/X
       /       \
<--A--B--D---G--H--I--J   <-- master

The reason is that the arrows really do point backwards, with feature/X pointing to the tip commit of branch feature/X, i.e., to commit F, and master pointing to the tip commit of master (which I've assumed is J here, though maybe there are more given your original drawing).

As you've noted, feature/X ^master (which can also be spelled master..feature/X) fails because commit F is reachable from master by starting at the commit to which master points (J) and working backwards. When we hit commit H we work backwards through both parents simultaneously, so the request to eliminate all commits reachable from master also eliminates the C--E--F sequence.
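To see this for yourself, here is a minimal sketch that rebuilds the lettered graph in a throwaway repository. The helper names (`stamp`, `c`), the one-file-per-commit layout, the fixed timestamps and the lightweight tags A through J are all assumptions made for illustration; none of them come from the question.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

c A; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
stamp; git merge -q --no-ff --no-edit -m H feature/X; git tag H
c I; c J

# Everything reachable from feature/X is also reachable from master,
# so excluding master leaves nothing to list.
count=$(git rev-list --count feature/X "^master")
echo "feature/X ^master lists $count commits"
if git merge-base --is-ancestor feature/X master; then
  echo "F is already reachable from master"
fi
```

The `git merge-base --is-ancestor` check is just another way of asking the same reachability question.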

To stop that from happening, we must eliminate commits starting from some point before H, i.e., a point before the first merge that brings the tip of feature/X into master. Any of commits G, D, or B will suffice. That is, if we had the hash of any one of these commits, then:

git rev-list feature/X ^$hash

would do the trick.
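As a sketch of that claim, the script below rebuilds the lettered graph in a scratch repository and tries each of B, D and G as the exclusion point. The helpers `stamp` and `c`, the tags named after the letters, and the use of `git log --format=%s` (which takes the same revision arguments as `git rev-list`, but prints subjects instead of hashes) are illustrative assumptions.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

c A; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
stamp; git merge -q --no-ff --no-edit -m H feature/X; git tag H
c I; c J

# Any exclusion point before the merge H gives the same answer.
for hash in B D G; do
  listed=$(git log --format=%s feature/X "^$hash" | xargs echo)
  echo "feature/X ^$hash -> $listed"
done
from_b=$(git log --format=%s feature/X "^B" | xargs echo)
```

In this toy repository each of the three runs lists F, E and C.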

qzb's method finds commit D and then uses a suffix ^ to identify its first and only parent. It works by listing every commit reachable from J (the tip of master) that is not also reachable from F (the tip of feature/X). There is a caveat: git rev-list may sort commits, so that D may not actually be listed last, but the | tail -1 assumes that the listing ends with commit D's hash.

This therefore depends on the date-stamps stored in the commits. If the commits were made in order (so that the dates all increase as the commits move forward in time), that's not a problem, and usually they are. But sometimes commits end up in the "wrong" date order, because a clock was set incorrectly, or because commits were made on different computers that disagree about what time it is, or for some other reason.

We can fix the date assumption by telling git rev-list to use --topo-order, which forces it to list commits in graph order (using a partial order from the graph topology). So when using this method, add --topo-order.
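Here is a small sketch of qzb's one-liner with --topo-order added, run against a scratch copy of the lettered graph. The repository layout, the helpers `stamp` and `c`, and the letter tags are assumptions for illustration only. The timestamps here happen to be in order, so this shows the mechanics rather than a date-skew failure.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

c A; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
stamp; git merge -q --no-ff --no-edit -m H feature/X; git tag H
c I; c J

# Last commit on master that is not on feature/X, in graph order.
D=$(git rev-list --topo-order feature/X..master | tail -n 1)
picked=$(git log -1 --format=%s "$D")
# Step to its parent (B) and list what feature/X has beyond that.
listed=$(git log --format=%s "$D^..feature/X" | xargs echo)
echo "tail picked commit $picked; feature/X commits: $listed"
```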

Noufal Ibrahim's method works by finding commit H instead, using git log. It's a bit better to use git rev-list, which takes the same options as git log but just prints the hash (which is all we want):

H=$(git rev-list --merges -1 master)
# H stands for Hash, and also for "commit H" :-)

(Note that we must specify a starting point for the graph walk, whereas git log defaults to starting from HEAD.) Obtaining the hash for commit H is not quite sufficient, since we must then step back one parent. Since H has two parents, we must be careful to step from H to G (not to F).

Fortunately, whenever we merge with git merge, Git makes sure that the first parent of the new merge commit is the commit that was on the current branch. That is, when we made commit H by running git merge feature/X, we were on branch master and the name master meant commit G. So the first parent of H is G, hence $H^1, or just $H^, identifies commit G:

H=$(git rev-list --merges -1 master)
git rev-list feature/X ^${H}^

The curly braces around H are not technically necessary, just meant for clarity: we expand $H and then put ^ after the expansion (to identify commit G), and another ^ in front of the expansion (to tell git rev-list that we're using this as an exclusion specifier).
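The first-parent rule is easy to check in a scratch repository. The sketch below assumes the same lettered graph, rebuilt with hypothetical helpers (`stamp`, `c`) and one lightweight tag per letter, so that H^1 and H^2 can be compared against G and F directly.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

c A; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
# We are on master (at G) when we merge, so G becomes the first parent.
stamp; git merge -q --no-ff --no-edit -m H feature/X; git tag H

first=$(git rev-parse "H^1")
second=$(git rev-parse "H^2")
[ "$first" = "$(git rev-parse G)" ] && echo "H^1 is G (the master side)"
[ "$second" = "$(git rev-parse F)" ] && echo "H^2 is F (the merged branch)"
[ "$(git rev-parse "H^")" = "$first" ] && echo "H^ is shorthand for H^1"
```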

Since $yes ^$no can be written as $no..$yes instead, we can also write this as:

H=$(git rev-list --merges -1 master)
git rev-list ${H}^..feature/X

This method is a bit more efficient (we enumerate just the one commit H, rather than using tail -1 to get the last commit of some potentially long chain) and does not suffer from date-order issues (but we saw above how to fix those with --topo-order).

Incidentally, this too really should use --topo-order when finding commit H, for the same reason: we don't want Git to sort and put some other merge (something before A, for instance) in front of H.
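Putting those pieces together, here is a sketch of the merge-based method on a scratch copy of the lettered graph, showing the exclusion form and the two-dot range form side by side. The helpers (`stamp`, `c`), the tags, and the use of `git log --format=%s` to print commit subjects rather than hashes are assumptions made for readability.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

c A; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
stamp; git merge -q --no-ff --no-edit -m H feature/X; git tag H
c I; c J

# Most recent merge on master, in graph order.
H=$(git rev-list --topo-order --merges -1 master)
# Exclusion form and range form name the same set of commits.
form1=$(git log --format=%s feature/X "^${H}^" | xargs echo)
form2=$(git log --format=%s "${H}^..feature/X" | xargs echo)
echo "exclusion form: $form1"
echo "range form:     $form2"
```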

The remaining flaws

qzb noted one of them: while feature/X points to commit F, if there are more merges in the past, we don't necessarily "know where to stop". That is:

          o--o---o--o--o     <-- feature
         /    \ /       \
...--o--o--o---o--o--o---o   <-- master

By drawing this particular graph in this particular way, it's clear to us that all the commits along the "top line" are those that were done on feature, and that feature was merged into master twice, while master was merged back into feature once. (Incidentally, this sort of "cross merging" can get you into trouble. It's not wrong, but in general you should be careful about merging A into B and B into A. In some cases this produces multiple merge bases for merges, which can be tricky.) But it's not clear to Git, and there are other ways to draw the graph that will obscure it from our own eyes as well. (Moreover, if you ever allow "fast forward" merges, which move the branch name forward without creating an actual merge commit, untangling branch history becomes impossible in general. Again, it's not wrong, you just need to be prepared to deal with it.)
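The fast-forward remark can be illustrated with a tiny scratch repository. This is only a sketch: the branch name `feature`, the helpers `stamp` and `c`, and the three-commit history are made up for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

c A
git checkout -q -b feature; c B; c C
git checkout -q master
# master has not moved since feature branched off, so Git can fast-forward.
git merge -q --ff-only feature

merges=$(git rev-list --count --merges master)
only_feature=$(git rev-list --count master..feature)
echo "merge commits on master: $merges"
echo "commits on feature but not on master: $only_feature"
```

After the fast-forward both names point at the same commit and there is no merge commit to search for, so neither method described above has anything to anchor on.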

A more important issue occurs with both methods if there is a merge on master past commit H. That is, suppose that the lettered graph we've been drawing so far is still a bit misleading, and in fact it should look like this:

        C--E--F           <-- feature/X
       /       \
<--A--B--D---G--H--I--J   <-- master
                  /
         <-o--o--o        <-- feature/Y

Now if we do:

H=$(git rev-list --topo-order --merges -1 master)

we will wind up setting $H to point to commit I, rather than commit H. The reason is simple: we asked for the most recent (topologically) commit starting from master and working backwards, that is also a merge commit. That's commit I. But I^ is commit H and excluding H will make the subsequent git rev-list exclude commits C--E--F.
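A sketch of this failure, using a scratch repository with a second merge at I. Where feature/Y branches off is not given in the drawing, so branching it from A is an assumption here, as are the helpers `stamp`, `c` and `m`, the letter tags, and the commit names Y1 and Y2.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }
# m LETTER BRANCH: true (non-fast-forward) merge of BRANCH, tagged with its letter
m() { stamp; git merge -q --no-ff --no-edit -m "$1" "$2"; git tag "$1"; }

c A
git checkout -q -b feature/Y; c Y1; c Y2
git checkout -q master; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
m H feature/X; m I feature/Y; c J

# The most recent merge is now I, not H.
M=$(git rev-list --topo-order --merges -1 master)
found=$(git log -1 --format=%s "$M")
# I^ is H, and H already reaches C, E and F, so nothing is left.
left=$(git rev-list --count feature/X "^${M}^")
echo "most recent merge: $found; commits listed for feature/X: $left"
```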

That seems to doom this approach; can we go back to locating commit D? No, because qzb's trick:

$(git rev-list feature/X..master | tail -n 1)

stops working when Git races down the second parent of I, i.e., through feature/Y, and begins listing all those commits. Without --topo-order, we get the oldest commit. With --topo-order we are still not told which chain (I^1 vs I^2) is handled first. If that chain connects back at commit A or earlier, we may get the hash for commit A-or-earlier, instead of that for commit D.
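The same scratch graph shows the one-liner going wrong. In this sketch feature/Y is assumed to branch from A, and its commits (named Y1 and Y2 here, purely for illustration) are given older timestamps than D. The helpers and tags are hypothetical, as before.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }
# m LETTER BRANCH: true (non-fast-forward) merge of BRANCH, tagged with its letter
m() { stamp; git merge -q --no-ff --no-edit -m "$1" "$2"; git tag "$1"; }

c A
git checkout -q -b feature/Y; c Y1; c Y2
git checkout -q master; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
m H feature/X; m I feature/Y; c J

# Without --topo-order the oldest listed commit comes last: Y1, not D.
last=$(git rev-list feature/X..master | tail -n 1)
picked=$(git log -1 --format=%s "$last")
# Y1^ is A, so the final listing wrongly includes B as well.
listed=$(git log --format=%s "$last^..feature/X" | xargs echo)
echo "tail picked $picked; listed: $listed"
```

With these timestamps the listing comes out as F, E, C and B, one commit too many.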

We could fix that by noting the additional merge I that brings in feature/Y, and excluding feature/Y so that Git does not race down that chain. But this begins to get complicated. What we really need, then, is not the most recent merge, but rather the merge that brings in commit F (i.e., "find me commit H"). Is there a way to get that? As it turns out, there is. What we want here is --ancestry-path.

The --ancestry-path option strips out commits that are not descendants of an excluded commit. Since feature/X is merged into master, we know for certain that there is some commit (actually H, of course) after F that is a descendant of F—i.e., F is one of its parents—and also is an ancestor of master. So:

git rev-list --ancestry-path --topo-order ^feature/X master

tells Git to list out commits H, I, and J, and no other commits. That is, we won't go racing down the other commits brought in by merge I: those commits will get pruned.
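To check the pruning, the sketch below runs that command on a scratch repository that includes the feature/Y merge. The branch point of feature/Y (A), the helpers `stamp`, `c` and `m`, and the tags are assumptions. `git log --format=%s` is used instead of `git rev-list` only so that the output shows letters rather than hashes.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }
# m LETTER BRANCH: true (non-fast-forward) merge of BRANCH, tagged with its letter
m() { stamp; git merge -q --no-ff --no-edit -m "$1" "$2"; git tag "$1"; }

c A
git checkout -q -b feature/Y; c Y1; c Y2
git checkout -q master; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
m H feature/X; m I feature/Y; c J

# Plain range: also walks down G, D and the feature/Y commits.
plain=$(git rev-list --count "^feature/X" master)
# With --ancestry-path: only descendants of F that master can reach.
path=$(git log --format=%s --ancestry-path --topo-order "^feature/X" master | xargs echo)
echo "without --ancestry-path: $plain commits"
echo "with --ancestry-path: $path"
```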

If we then discard all but the last commit (with tail -1 again), and optionally speed things up a bit with --merges to discard any non-merges even before using tail, that will let us locate commit H even if I or J is a merge:

H=$(git rev-list --ancestry-path --topo-order \
    --merges ^feature/X master | tail -1)
git rev-list $H^..feature/X

This is a hybrid of the two methods: we use --ancestry-path to find commits starting from H, and tail -1 to drop all but commit H, then use ^$H^ to exclude commit-G-and-earlier.
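As a final sketch, here is the hybrid method run end to end on a scratch repository where I is also a merge. As in the earlier sketches, the branch point of feature/Y, the helpers (`stamp`, `c`, `m`) and the letter tags are assumptions made for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
t=1600000000
stamp() { t=$((t + 60)); export GIT_AUTHOR_DATE="$t +0000" GIT_COMMITTER_DATE="$t +0000"; }
# c LETTER: commit a new file, then tag the commit with its letter
c() { stamp; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }
# m LETTER BRANCH: true (non-fast-forward) merge of BRANCH, tagged with its letter
m() { stamp; git merge -q --no-ff --no-edit -m "$1" "$2"; git tag "$1"; }

c A
git checkout -q -b feature/Y; c Y1; c Y2
git checkout -q master; c B
git checkout -q -b feature/X; c C; c E; c F
git checkout -q master; c D; c G
m H feature/X; m I feature/Y; c J

# Oldest merge on the path from feature/X's tip up to master: commit H.
H=$(git rev-list --ancestry-path --topo-order \
    --merges "^feature/X" master | tail -n 1)
found=$(git log -1 --format=%s "$H")
# Exclude H's first parent (G) and everything before it.
listed=$(git log --format=%s "$H^..feature/X" | xargs echo)
echo "merge that brought in feature/X: $found"
echo "commits made on feature/X: $listed"
```

Wrapping the last two commands in a function that takes the two branch names would be a natural next step, but the flaws described in this answer (repeated merges, fast-forwards) would still apply.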
