TL;DR
Use git branch --contains with the hash IDs you find. But: why do you care? The hash ID is all you really need.
Long
There's a basic problem here with your diagram: it has no branch names on it. Let's put some branch names on it and then ask a key question:
    /---D         <-- br1
   /
  /   /---E       <-- br2
 /   /
A - B - C         <-- br3
 \
  \
   \--F           <-- br4
Which branch is commit A on?
Warning: this is a trick question! The answer is below, with (I hope) enough text in between so that you can't just cheat and read it, and will instead have to think about this. The obvious answer is "it's on br3" but this isn't right. (It's not wrong, it's just not right.)
What you will want to do
I also read here about something about an --all flag ...
Use this flag, then use git describe or git branch --contains with the found commit hash IDs, or:
I also looked at the --source flag, but the results doesn't really make any sense to me.
The --source flag does what the git log documentation says it does:
--source
Print out the ref name given on the command line by which each commit was reached.
but, as is common, the reference manual is terse and laden with jargon here. The flag gets you some of the information you need, and sometimes it will be everything you need, but git branch --contains or git describe may still be more useful.
The answer to the trick question
Commit A is on every branch.
The trick here is that in Git, many commits are on many branches simultaneously. Some commits may be on no branch. This gets us into a separate Git question, which is: What exactly do we mean by "branch"? The word branch in Git is actually ambiguous, and overused, sometimes to the point where it nearly loses all meaning. Once you get used to the crazy multiple meanings, though, it turns out that humans usually assign the right meaning automatically: a branch is a branch name, but it's also a remote-tracking name, a particular commit that Git calls more formally a tip commit, and a set of commits ending at the tip commit. A Git branch is all of these things, and yet, when a human says "branch", they usually mean only one of these things.
To make any sense out of this, we need the concept of reachability. Reachability is actually a graph-theory thing. The diagram you drew is a commit graph, with the letters A through F standing in for actual commits. Each actual commit has some unique, big and ugly and random-looking hash ID, but those are too hard for humans, so we mostly ignore them whenever we can, or use substitutes like these letters A through F here.
Each commit links backwards to a previous or parent commit. Here, commit C links backwards to commit B, which links backwards to commit A. Commit D links backwards to A as well, and so does F; E links backwards to B, which we already noted links backwards to A.
By following the backwards-pointing links, Git finds the commits. Git finds the end commits—the branch tip commits—using the branch names, which are what humans tend to care about and use. But then Git works backwards from there.
When we start with, say, br1, Git will find commit D, then work backwards and find commit A. This means commit A is "on", or "contained in", branch br1. But we can also start with br2 and find A, and we can start with br3 and find A, and so on. Indeed, since A is our very first commit, all roads lead to A: commit A is on every branch. It will be on future branches too.¹
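To make the backwards walk concrete, here is a minimal sketch in Python. It is only a toy model of the diagram, not how Git stores anything: the dictionaries PARENTS and BRANCHES and the function reachable are names I made up for illustration.

```python
# Toy model of the diagram (an assumption for illustration, not Git's storage).
# Each commit maps to the list of its parent commits.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["A"],
    "E": ["B"],
    "F": ["A"],
}
# A branch name just points at one tip commit.
BRANCHES = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}


def reachable(tip):
    """Return the set of commits found by walking backwards from tip."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(PARENTS[commit])
    return seen


for name, tip in sorted(BRANCHES.items()):
    print(name, sorted(reachable(tip)))
```

Every line of output includes A, which is the whole point: whichever name you start from, the walk ends up at A.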
It is literally impossible, in Git, to know which branch a commit was created on unless you record that as text in the commit message. That's because we can create and destroy branch names at will: each branch name simply selects (or "points to") some commit in the commit graph. We pick this commit at the time we create the branch name.
Then, when we check out (switch to) the branch and make a new commit, Git makes the new commit such that it points backwards to the commit we had checked out, and stores the new commit's hash ID into the branch name so that the new commit is now the tip commit. So, given your diagram, if we git switch br3 and make a new commit, the name br3 will point to our new commit G afterward; G will point backwards to C; and commit A remains on every branch.
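As a rough sketch of that bookkeeping, again using a made-up Python model (PARENTS, BRANCHES, and make_commit are hypothetical names, and real commits carry much more than a parent link):

```python
# Toy model of the diagram (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["B"], "F": ["A"]}
BRANCHES = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}


def make_commit(branch, new_id):
    """Model committing while `branch` is checked out."""
    # The new commit points backwards to the commit that was the tip...
    PARENTS[new_id] = [BRANCHES[branch]]
    # ...and the branch name is updated to point to the new commit.
    BRANCHES[branch] = new_id


make_commit("br3", "G")
print(BRANCHES["br3"], PARENTS["G"])  # G ['C']
```

Nothing about commits A through F changed: only the name br3 moved.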
If we delete branch name br1 entirely, commit D becomes un-findable, because we find the commits using branch names and working backwards. There's only one way to find D right now, and that's to use br1. So by deleting the name br1, we "lose" commit D. It becomes unreachable.²
So reachability means "how we get there". We get to commits from branch names. For much more on this concept, see Think Like (a) Git.
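Here is a small sketch of what "losing" D looks like in terms of reachability. It uses an illustrative Python model of the diagram; the names PARENTS, branches, and reachable_from are assumptions of mine, not anything Git exposes.

```python
# Toy model of the diagram (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["B"], "F": ["A"]}


def reachable_from(branches):
    """Return every commit reachable from any branch name."""
    seen = set()
    stack = list(branches.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


branches = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}
before = reachable_from(branches)
del branches["br1"]          # model deleting the branch name br1
after = reachable_from(branches)
lost = before - after
print(sorted(lost))  # ['D']
```

Commit D is still in PARENTS, just as the real commit is still in the repository for a while; there is simply no name left that leads to it.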
¹ It is possible, in Git, to create more than one root commit, and hence set up new branches that don't lead back to commit A. But that's not very typical and we won't cover it here.
² Git will eventually discard an unreachable commit. You do, however, get a grace period to get the commit back, typically a minimum of 30 days. The problem is that you must find the commit's unique hash ID, which you would do using the branch name, but now that the branch name is gone... well, that's the dilemma.
Reachability, git branch --contains, and git log --source
Now that you understand reachability, git branch --contains will make sense. You give git branch --contains some hash ID, e.g., the hash ID of commit B or E or A. What git branch --contains does is:
- starting from every branch name, work backwards;
- if this reaches the commit, print the branch name
so when used with the hash ID of commit B, this will print br2 and br3, as those are the two branch names that can reach B.
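Those two steps can be sketched in a few lines of Python. This is a simplified model of the idea, not Git's actual implementation; PARENTS, BRANCHES, and branches_containing are hypothetical names for the toy graph from the diagram.

```python
# Toy model of the diagram (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["B"], "F": ["A"]}
BRANCHES = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}


def branches_containing(target):
    """Return the branch names from which `target` is reachable."""
    result = []
    for name in sorted(BRANCHES):
        # Starting from this branch name's tip, work backwards.
        seen = set()
        stack = [BRANCHES[name]]
        while stack:
            commit = stack.pop()
            if commit in seen:
                continue
            seen.add(commit)
            stack.extend(PARENTS[commit])
        # If the walk reached the commit, report the branch name.
        if target in seen:
            result.append(name)
    return result


print(branches_containing("B"))  # ['br2', 'br3']
print(branches_containing("A"))  # ['br1', 'br2', 'br3', 'br4']
```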
The --source option to git log simply prints whichever name git log was using at the time it found some commit. This is actually more complicated to explain, because git log itself is pretty complicated!
What git log does is walk the graph, printing some of the commits it encounters as it goes. That is, we give git log some number of starting points, such as one or more branch names or commit hash IDs. The git log command takes these names and resolves them to hash IDs, or takes the hash IDs (which are already hash IDs), and finds the named commits. It puts each commit into a priority queue.
If we run git log with no arguments, git log uses the special name HEAD. This name is normally attached to one branch name. Using git switch or git checkout, we control which branch name HEAD is attached to; that's the branch that gets extended when we make a new commit, so it's pretty important! That branch name is the current branch, and that's what git log shows by default: that is, running git log with no arguments means git log resolves HEAD to the current commit's hash ID, and puts that (single) hash ID in the queue.
Now that the queue has some commit or commits in it, git log takes the front entry off the queue. Since the queue is a priority queue, there's a sorting order, if there's more than one entry in it. But it's extremely common for the queue to have just the one entry! For instance, if we run git log with no arguments, the current commit is the one entry in it when we start. If we run git log br4, Git puts F's hash ID into it, and again there's just the one entry.
Anyway, having taken the front entry out of the queue, git log now decides, based on any arguments you gave like --no-merges or whatever, whether to show this commit. If it's supposed to show the commit, it does that. We call this visiting the commit, as though we're on holiday and going to certain attractions or cities or whatever.
Next, having shown or not shown the commit, git log finds the parent or parents of the commit. In your sample graph, each commit has exactly one parent, except for commit A which has no parent. (A merge commit, if there were any, would have two or more parents.) By default, git log puts all the parents into the queue, unless those parents have already been visited.
With its one parent, if we've just visited F, git log would put F's parent A into the queue. The queue was empty (F was the only thing in it at the start of all of this), so now there's again just one entry in the queue. The git log command now takes out and visits the one commit in the queue, i.e., commit A. It shows commit A, if it's supposed to do that, and then puts A's parents into the queue. There are no parents, so this puts nothing in the queue, and the queue remains empty.
Once the queue is empty like this, git log quits. So by starting at F via name br4, we visit commits F and A and stop, and that's what git log would show.
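The loop described so far (take an entry from the queue, decide whether to show it, queue its parents, stop when the queue is empty) looks roughly like this in Python. It is a deliberately simplified sketch with a single starting point and a plain queue; PARENTS, BRANCHES, log_walk, and the show filter are illustrative names, not Git internals.

```python
from collections import deque

# Toy model of the diagram (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["B"], "F": ["A"]}
BRANCHES = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}


def log_walk(start_name, show=lambda commit: True):
    """Walk backwards from one branch name; return the commits shown."""
    queue = deque([BRANCHES[start_name]])  # resolve the name to a commit
    queued = set(queue)                    # commits that were ever queued
    shown = []
    while queue:                           # an empty queue means we are done
        commit = queue.popleft()           # "visit" the front entry
        if show(commit):                   # stands in for options like --no-merges
            shown.append(commit)
        for parent in PARENTS[commit]:
            if parent not in queued:       # never queue a commit twice
                queued.add(parent)
                queue.append(parent)
    return shown


print(log_walk("br4"))  # ['F', 'A']
```

Passing a different show function models options that hide some commits: the hidden commit is still visited, and its parents are still queued.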
If, on the other hand, we run git log --all, the code will put D, E, C, and F all into the queue. There are now four entries so the priority really matters. This priority causes git log to sort its output. The default sort is based on the stored committer date in each commit, with later commits being higher priority. So if commit F is the latest commit, that's the one that surfaces first.
We'll visit F, printing it out and putting its parent A into the queue: the queue now contains A, D, E, and C (kept sorted by date). Let's say that E has the next-highest-priority date: git log will pop E out of the queue, visit it, and insert B into the queue. Then git log will take the highest priority commit out of the queue (let's continue the theme and say this is D) and visit that one. This would put A into the queue, but it's already there; it doesn't go in twice. We now visit C, which wants to put B in the queue, but it's already there; we then visit B, which wants to put A in the queue, but it's already there; and we visit A, which is the last thing in the queue and puts nothing into the queue, and so git log finally stops.
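Here is a sketch of that multi-start walk using Python's heapq module as the priority queue. The committer dates in COMMIT_TIME are invented so that F is newest, then E, D, C, B, and A, matching the order assumed above; the real dates in your repository will differ, and log_all is a hypothetical name.

```python
import heapq

# Toy model of the diagram (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["B"], "F": ["A"]}
BRANCHES = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}
# Made-up committer dates: a larger number means a later commit.
COMMIT_TIME = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}


def log_all():
    """Model `git log --all`: start from every branch tip, newest first."""
    heap = []
    queued = set()

    def push(commit):
        if commit not in queued:  # a commit never goes into the queue twice
            queued.add(commit)
            # heapq is a min-heap, so negate the date to pop the newest first.
            heapq.heappush(heap, (-COMMIT_TIME[commit], commit))

    for tip in BRANCHES.values():
        push(tip)

    order = []
    while heap:
        _, commit = heapq.heappop(heap)
        order.append(commit)
        for parent in PARENTS[commit]:
            push(parent)
    return order


print(log_all())  # ['F', 'E', 'D', 'C', 'B', 'A']
```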
The --source flag simply annotates each output, for any given commit, with the name that first led Git to this commit. So for C, that's br3.³ For B, that's either br2 or br3, depending on whether git log visited C or E first.
The visiting order depends on the priority order. You can control this, to some extent at least, with options like --topo-order or --author-date-order. But in a big graph, especially one with a lot of branch-and-merge action in it, it's very difficult to know which of many names might first reach some commit. Only in small and simple graphs like yours here will you get something predictable.
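To see how the date order decides which name gets attached to B, here is a sketch that extends the priority-queue walk with a source label. As before, this is a toy model: the two date tables E_NEWER and C_NEWER are invented, log_all_with_source is a hypothetical name, and the labels use the full refs/heads/ form that git log --all prints.

```python
import heapq

# Toy model of the diagram (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "E": ["B"], "F": ["A"]}
BRANCHES = {"br1": "D", "br2": "E", "br3": "C", "br4": "F"}
# Two made-up sets of committer dates (larger means later).
E_NEWER = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
C_NEWER = {"A": 1, "B": 2, "C": 5, "D": 4, "E": 3, "F": 6}


def log_all_with_source(commit_time):
    """Model `git log --all --source`: return (commit, source name) pairs."""
    heap = []
    source = {}  # commit -> the name that first led the walk to it

    def push(commit, name):
        if commit not in source:  # only the first name to arrive is recorded
            source[commit] = name
            heapq.heappush(heap, (-commit_time[commit], commit))

    for name, tip in BRANCHES.items():
        push(tip, "refs/heads/" + name)

    out = []
    while heap:
        _, commit = heapq.heappop(heap)
        out.append((commit, source[commit]))
        for parent in PARENTS[commit]:
            push(parent, source[commit])  # parents inherit the child's label
    return out


print(dict(log_all_with_source(E_NEWER))["B"])  # refs/heads/br2
print(dict(log_all_with_source(C_NEWER))["B"])  # refs/heads/br3
```

Same graph, same names, different dates, different answer for B. In both runs A is labeled refs/heads/br4, because F is visited first and A is F's parent.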
³ With git log --all you will see refs/heads/br3 rather than just br3. That's simply the full name of the branch. All branches have short names like br3, and full ones like refs/heads/br3. I like to think of the full name as what their mom (or spouse) says when she's mad at them, kind of like Stella Mudd in these ST:TOS clips.
Branch names don't matter
At the top, I asked why you care which branch(es) some commit is "on". Sometimes you will actually care, and then asking the question which branches contain this commit is fine. But if all you want to do is see the file, or see it as changes, just tell Git to show you the file, or show you the changes:
git show a123456:path/to/file
or:
git show a123456 -- path/to/file
The former shows the contents of the named file as stored in the named commit. The latter takes the named commit (using the abbreviated commit hash a123456), finds its parent (singular⁴), and runs git diff on the two commits. Then, because of the -- path/to/file pathspec at the end, it shows only the diff for that one file. So you'll see what changed in that one file, in that one commit, with respect to its parent.
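The difference between the two forms is easier to see with a toy model in which each commit holds a full snapshot. The sketch below uses Python's difflib to stand in for git diff; the snapshot contents, the commit name "parent", and the functions show_file and show_diff are all invented for illustration, and the diff output format is only similar to Git's.

```python
import difflib

# Made-up snapshots: every commit stores a full copy of every file.
SNAPSHOTS = {
    "parent": {
        "path/to/file": "line one\nline two\n",
        "other.txt": "x\n",
    },
    "a123456": {
        "path/to/file": "line one\nline 2\nline three\n",
        "other.txt": "y\n",
    },
}
PARENT_OF = {"a123456": "parent"}


def show_file(commit, path):
    """Like `git show commit:path`: the file as stored in that commit."""
    return SNAPSHOTS[commit][path]


def show_diff(commit, path):
    """Like `git show commit -- path`: that file compared with the parent."""
    old = SNAPSHOTS[PARENT_OF[commit]].get(path, "")
    new = SNAPSHOTS[commit].get(path, "")
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile="a/" + path,
            tofile="b/" + path,
        )
    )


print(show_file("a123456", "path/to/file"))
print(show_diff("a123456", "path/to/file"))
```

Note that other.txt also changed in this made-up commit, but it never shows up, because the path limits what is compared.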
You can even extract the entire file from that commit, overwriting the current working tree copy, with git restore:
git restore --source=a123456 --worktree -- path/to/file
Of course, you should first make sure you don't have anything valuable in path/to/file, because the copy that's in your working tree is not in Git and Git cannot get it back after you tell Git to overwrite it. Only committed files are actually stored in Git.
This—the saved copy of the file, or the changes from the parent—is usually all you care about. These are easy to get once you have the commit's hash ID. That hash ID is the "true name" of the commit: it always works to identify that one particular commit.
The point of a branch name, in Git, is just to help you find commits. The hash IDs are their real names. They're just too ugly to deal with: we have to use mouse cut-and-paste or whatever, once we have found them. But if you have run a command that printed the hash ID of the commit you care about, just grab that with your mouse and get to work!
⁴ A merge commit, which has two or more parents, causes a problem here. Each commit holds a full snapshot of every file. So to see what changed in some commit, we have Git use its backwards-pointing link from commit to parent. The parent also has a full snapshot, so the parent commit contains the same file, unless the file itself is all-new. Git can then extract the file from each commit and compare the two files and tell you what changed.
But a merge commit has two or more parents. That's what defines it as a merge commit in the first place. Since it does have at least two parents, we no longer know which parent commit to use to get the "earlier" version of the file. There are two or more earlier versions! The git log command, when used as git log -p to show commits as patches, cheats by default: it just does not bother to show anything at all. The git show command works harder by default, doing something Git calls a combined diff. We won't go into any detail here though.