Could you please advise how I can determine the following details of a Git branch:
- Who created the branch
- When the branch was created
- Which branch the new branch was created from
You are putting too much faith into the idea of a branch. Git does not care about branches; Git cares about commits. But in order to say this, we must first stop and define the word branch. In this paragraph, what I mean by "branch" is a branch name like master or develop or feature/tall. These names have, to Git, only one function: to record the raw hash ID of a commit.
Git has other entities that people can and do call a "branch". See What exactly do we mean by "branch"?
In any case, a branch name in Git holds the hash ID of one commit. That's actually very useful: both to us humans, for whom Git hash IDs look like random junk, and for Git itself. But that's all they hold by themselves. The important information is all stored elsewhere, mostly in the commits themselves.
Git really cares about commits. Commits store most of the data you're looking for: who made them, when, and from which other earlier commit.
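As a minimal sketch of where that data lives, the following builds a throwaway repository and reads the author, date, and parent straight from a commit. The temporary directory, the identity "Example Author", and the file name are assumptions made up for the illustration.

```shell
set -e
# Hypothetical identity, used only inside this throwaway repository.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force a known initial branch name

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two >> file.txt
git commit -q -am "second commit"

# Who, when, and "from which earlier commit" all come from the commit object.
author=$(git log -1 --format=%an master)
when=$(git log -1 --format=%ad master)
parent=$(git log -1 --format=%P master)
echo "author=$author"
echo "when=$when"
echo "parent=$parent"

# The branch name itself resolves to one hash ID and nothing more.
git rev-parse master
```

Nothing printed here describes the name master itself; every field belongs to the commit that the name currently points to.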
Branch names do not store any of the data you are looking for. They have no creator information. They have no history of their own either (but we'll see something more in a moment). There is an important reason for this.
Unlike ClearCase, Git is a distributed version control system. In ClearCase, there is a central, managed storage location: the Volume Object Base or VOB. Individual users go to the central storage and get things from it, and put things back into it. To do this, these users must all share the VOB's names for things. So if the VOB wants to call something Bruce, well, everyone else might as well call it Bruce too.
With Git, each repository is independent. Everyone has his or her own branch names, independent of everyone else's branch names. The only truly universal name in Git is the hash ID. Everyone agrees on hash IDs: a commit whose hash ID is 08da6496b61341ec45eac36afcc8f94242763468 is 08da6496b61341ec45eac36afcc8f94242763468 in every Git repository. No other commit can have ID 08da6496b61341ec45eac36afcc8f94242763468.
If you and I plan to connect our repositories together, we might want to coordinate our branch names now and then, and we can do that. But my repository has my branch names, and yours has yours. You are the creator of all of your branch names. I literally cannot create a branch name in your repository. All I can do is hand you a commit (by its big ugly hash ID) and then ask your Git to create or update some name. [1]
It's more common to set things up the other way around: I give you read-only access to my repository. You run git fetch to have your Git call up my Git over the Internet. Your Git asks my Git: "Hey, other-Git, what branch names do you have? What are the universally-shared hash IDs that go with these branch names?" My Git gives you my list of names-and-hash-IDs and your Git works from there:
- master is a123456...: I have a123456..., so I'm good.
- dev is b789abc...: Hm, I don't have b789abc... Perhaps I should ask for that commit by its hash ID.
This goes on until your Git has all the information my Git hands over in this phase of the communication. Your Git then asks, by hash ID, for any commit it wants. If that commit's parent commit (another hash ID, stored inside the actual commit) represents a commit your Git does not have, your Git can ask for that commit by its hash ID, and so on.
Eventually, your Git gets back to some hash ID that your Git already has, in which case your Git has the commit and doesn't need it sent again, or I run out of commits and say "that's all there are". [2] Your Git then creates, in your Git repository, every commit I've handed over.
The last thing your Git does, in this process, is to create or update your remote-tracking names. Here, instead of using your branch names (which are yours, not mine!), your Git changes all of my names, in a simple and easy-to-handle manner. Your Git uses a nickname for my Git, rather than a URL. Your Git's nickname is probably origin, though you can, and in fact must, set a different one for every other Git you're going to call up like this. The name origin is just the default name for the Git from which you made your initial clone.
So let's assume you're having your Git call my Git origin. Your Git has a complete list of all of my branches and their corresponding (single) hash IDs. So your Git now creates or updates an origin/* name corresponding to each of my Git's names. My Git told your Git that my master is a123456..., so your Git creates or updates your origin/master to point to commit a123456..., which you either already had, so that my Git did not have to send it, or did not have but asked for and received. My Git told your Git that my dev is b789abc...; your Git has this commit now, so it creates or updates your origin/dev to point to it.
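The exchange above can be sketched with two local repositories standing in for "my Git" and "your Git". The directory names mine and yours, the identity, and the file names are all assumptions for this illustration; a real setup would normally use a URL rather than a local path.

```shell
set -e
# Hypothetical identity, used only inside these throwaway repositories.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)

# "mine" plays the other person's repository, with branches master and dev.
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master
echo a > a.txt && git add a.txt && git commit -q -m "on master"
git checkout -q -b dev
echo b > b.txt && git add b.txt && git commit -q -m "on dev"
git checkout -q master

# "yours" starts empty and knows "mine" under the nickname origin.
git init -q "$work/yours"
cd "$work/yours"
git remote add origin "$work/mine"
git fetch -q origin

# The fetch created remote-tracking names holding the hash IDs "mine" reported.
git for-each-ref --format='%(refname) %(objectname)' refs/remotes/origin
theirs_master=$(git -C "$work/mine" rev-parse master)
ours_origin_master=$(git rev-parse origin/master)

# The fetch created no branch names in "yours".
local_branches=$(git for-each-ref refs/heads | wc -l | tr -d ' ')
echo "local branches after fetch: $local_branches"
```

The commits arrived and the origin/* names record where the other repository's branches pointed, yet the receiving repository still has no branch names of its own.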
Again, it's the commits that matter. Commits, and hash IDs, are the universal currency of Git. Someone else's branch names ... well, those are that someone else's to worry about. We'll copy them down, and save them as remote-tracking names, but they're not branch names. My branch names are mine; your branch names are yours. Whatever Git someone is using, their branch names are theirs, not anyone else's.
[1] It could be a good idea for you to record that it was me who made that update-request, but Git itself does not bother to do that. Git itself has no authentication or authorization of any kind. Git relies on some other entity to do any required authentication and/or authorization. Since someone else is doing that (if it's being done at all), Git also leaves any recording of such information to that other entity.
[2] Technically my Git just hands over a commit that has no parent, i.e., is a root commit. Since it has no parent, there is no earlier commit to ask for.
With the above in mind (the fact that your branch names are yours and you create and update them), we should mention that Git does actually keep logs of your various refs. These logs are also yours, to do with as you wish. But every time your Git updates a ref, it saves the previous value of the ref in a "reflog".
A ref (or reference) in Git is a generalization of branch names (like master), tag names (like v2.1), remote-tracking names (like origin/master), and so on. Each kind of name still just stores one hash ID, but the "kind" of a name determines how you will use it.
A branch name is simply a ref whose full name starts with refs/heads/. So your master is really just the ref named refs/heads/master. A tag name is a ref whose full name starts with refs/tags/, so v2.1 is short for refs/tags/v2.1. A remote-tracking name starts with refs/remotes/ and goes on to include the name of the remote (origin) and another slash, so your remote-tracking name for origin's master is refs/remotes/origin/master.
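A small sketch of these full names, assuming a throwaway repository with a made-up identity. The remote-tracking name is written directly with git update-ref so that no second repository is needed; normally git fetch would create it.

```shell
set -e
# Hypothetical identity, used only inside this throwaway repository.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo x > x.txt && git add x.txt && git commit -q -m "initial"

git tag v2.1
# Stand-in for what a fetch from origin would have created.
git update-ref refs/remotes/origin/master "$(git rev-parse master)"

# Ask Git to expand each short name to its full ref name.
branch_full=$(git rev-parse --symbolic-full-name master)
tag_full=$(git rev-parse --symbolic-full-name v2.1)
remote_full=$(git rev-parse --symbolic-full-name origin/master)
printf '%s\n' "$branch_full" "$tag_full" "$remote_full"

# Every ref, whatever its kind, stores a single hash ID.
git for-each-ref --format='%(objectname:short) %(refname)'
```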
Whenever your Git updates any of these names, it keeps the old value in the reflog for that name. It puts the new value into that name, and puts the current date and time into the reflog along with the old value.
This means that if you want to know what hash ID your own master had yesterday, you can have your Git look through your reflog for your master. If master was updated earlier today, there's an entry in this reflog showing the hash ID it held yesterday.
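Here is a hedged sketch of reading a reflog, again in a throwaway repository with a made-up identity and a hypothetical branch named feature. It relies on reflogs being enabled, which is the default for ordinary (non-bare) repositories. The wording of the first reflog entry for a newly created branch is Git's own and could differ between versions, so the script only prints it.

```shell
set -e
# Hypothetical identity, used only inside this throwaway repository.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo one > file.txt && git add file.txt && git commit -q -m "first"
old=$(git rev-parse master)
echo two >> file.txt && git commit -q -am "second"
new=$(git rev-parse master)

# Each update of refs/heads/master left a timestamped entry.
git --no-pager reflog show --date=iso master

# master@{1} means "the value master held one update ago".
previous=$(git rev-parse 'master@{1}')
echo "previous=$previous"

# A newly created branch name starts its own reflog with a single entry,
# whose date is when the name was created in THIS repository.
git branch feature master
git --no-pager reflog show --date=iso feature
feature_entries=$(git reflog show feature | wc -l | tr -d ' ')
```

In a repository with enough history, master@{yesterday} asks the same question by time instead of by position. The reflog for feature is as close as plain Git gets to "when was this branch created, and from what", and it only answers that for names created in your own repository, for as long as the entry has not expired.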
These reflog entries eventually expire and get tossed out. This is very different from commits, which, for the most part, live forever. [3] It's also important to remember that your reflogs are yours. You cannot see mine.
There's a very special ref in Git spelled HEAD (in all uppercase), which usually stores the name of a branch rather than a raw hash ID. It can store a raw hash ID; in this mode, Git says that you have a "detached HEAD". But mostly, it remembers the name of the branch that you are "on": if git status says "On branch master", this means that HEAD holds the name refs/heads/master inside it. There is a reflog for HEAD, which stores the raw hash ID found by following through the name to the underlying commit hash.
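The two modes of HEAD can be seen in a short sketch. The repository and identity are throwaway assumptions; git symbolic-ref and git checkout --detach are standard commands.

```shell
set -e
# Hypothetical identity, used only inside this throwaway repository.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo x > x.txt && git add x.txt && git commit -q -m "initial"

# Attached: HEAD holds the full name of a branch.
attached=$(git symbolic-ref HEAD)
echo "HEAD holds the name: $attached"

# Detached: HEAD holds a raw hash ID, so symbolic-ref has no name to report.
git checkout -q --detach
if git symbolic-ref -q HEAD >/dev/null; then
    detached=no
else
    detached=yes
fi
head_hash=$(git rev-parse HEAD)
echo "detached=$detached, HEAD holds the hash: $head_hash"

# HEAD has its own reflog, recording the hash IDs it resolved to.
git --no-pager reflog show HEAD
```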
[3] Technically, a commit lives as long as you can find it. This makes the concept of reachability extremely important. For much more about reachability, see Think Like (a) Git.
git push is actually the odd case here
The commit flow I described above is for git fetch. Note that git pull means run git fetch, then run a second Git command. The second Git command runs only in your repository: your fetch gets read-only access to some other Git repository, gets commits (and other data) from it, and then updates your remote-tracking names. The second command is the one that incorporates new commits into your branches. I like to delay that second command, because I usually want to see what git fetch actually fetched before I make any decision about taking, or not taking, new commits, so I rarely use git pull at all.
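A sketch of that two-step habit, using two local repositories as stand-ins (the names mine and yours and the identity are assumptions). The second step shown is a fast-forward-only merge, which is just one of the commands that could fill that role.

```shell
set -e
# Hypothetical identity, used only inside these throwaway repositories.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master
echo a > a.txt && git add a.txt && git commit -q -m "first"

git clone -q "$work/mine" "$work/yours"

# A new commit appears in "mine" after the clone was made.
echo b >> a.txt && git commit -q -am "second"

cd "$work/yours"
before=$(git rev-parse master)

# Step 1: fetch only. Your own master does not move.
git fetch -q origin
incoming=$(git rev-list --count master..origin/master)
git --no-pager log --oneline master..origin/master

# Step 2: decide, then incorporate. Here: fast-forward only.
git merge -q --ff-only origin/master
after=$(git rev-parse master)
echo "incoming=$incoming before=$before after=$after"
```

Between the two steps, master..origin/master lists exactly the commits that a pull would have brought into the branch.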
Meanwhile, the closest thing there is to an opposite of fetch is git push. This has your Git call up another Git, usually by your own nickname for it, again: git push origin means you would like your Git to call up the Git at the URL your Git has stored under your name origin. Having done so, your Git then does not get commits from the origin Git. Instead, your Git sends commits (and other objects) to them.
Your Git will say: "I have commit 9876543... for you, do you have it yet?" If not, your Git will send it over. If 9876543... has parent a123456..., your Git makes sure their Git has a123456... too, and so on. Eventually, your Git reaches a commit that their Git already has, and does not have to send it, or reaches a root commit (one with no parent) so that there are no more commits to send.
Now that your Git has sent their Git all the commits that you have, that they don't, and that they'll need, your Git sends their Git a polite request: "Please, if it's OK, set your master to 9876543..." It is up to their Git to accept or reject this request. Basic Git (as opposed to the fancier, for-sale versions from folks like GitHub and Bitbucket and GitLab) has really simple rules here, though you can write your own fancy ones. Web hosting providers also provide user authentication and logging and so on, and it's up to them to give you fancier controls if you're the administrator. Basic Git's built-in check is just: "If I obey this request, will I still be able to find the branch-tip commit that is in the name already?"
This test (will I be able to find the current commit if I accept this new commit hash ID?) is a test to see if an operation is a fast-forward. This, once again, is all about reachability: starting at 9876543... and working backwards through parent hash IDs, can I get to the hash ID I have in the name now? If the name is new, basic Git just accepts the request, and if the polite request is to delete a name, basic Git just accepts that as well.
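The reachability question can be asked locally with git merge-base --is-ancestor. This is an equivalent check written as a sketch, not a claim about how the receiving Git implements it. The helper is_fast_forward, the commit labels A through D, and the branch name side are hypothetical.

```shell
set -e
# Hypothetical identity, used only inside this throwaway repository.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: make one commit that adds a file named after its label.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

# $1 = hash currently in the name, $2 = proposed new hash.
# Succeeds when $1 is reachable by walking back from $2 through parents.
is_fast_forward() {
    git merge-base --is-ancestor "$1" "$2"
}

# History: A <- B <- D on master, and A <- C on side.
commit_file A; a=$(git rev-parse HEAD)
commit_file B; b=$(git rev-parse HEAD)
commit_file D; d=$(git rev-parse HEAD)
git checkout -q -b side "$a"
commit_file C; c=$(git rev-parse HEAD)

# Suppose the name currently holds B.
if is_fast_forward "$b" "$d"; then to_d=yes; else to_d=no; fi   # D descends from B
if is_fast_forward "$b" "$c"; then to_c=yes; else to_c=no; fi   # C does not contain B
echo "B -> D fast-forward: $to_d"
echo "B -> C fast-forward: $to_c"
```

Moving the name from B to C would leave B unreachable from that name, which is exactly what the default check refuses.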
You can modify this polite request into a forceful command: "Set your master to 9876543...!" They can still reject it, but basic Git will simply obey the command. The "is this a fast-forward" test goes away.
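The difference between the polite request and the forceful command can be observed from the sending side with a local bare repository standing in for the receiver. The paths server.git and yours and the identity are assumptions; the bare repository uses Git's default settings, with no hooks or hosting-provider rules.

```shell
set -e
# Hypothetical identity, used only inside these throwaway repositories.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

work=$(mktemp -d)
git init -q --bare "$work/server.git"

git init -q "$work/yours"
cd "$work/yours"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/server.git"

echo a > a.txt && git add a.txt && git commit -q -m "A"
echo b > b.txt && git add b.txt && git commit -q -m "B"
git push -q origin master          # the name is new on the receiver: accepted

# Rewrite history locally: drop B and commit C in its place.
git reset -q --hard HEAD~1
echo c > c.txt && git add c.txt && git commit -q -m "C"

# Polite request: not a fast-forward, so it is refused.
if git push -q origin master 2>/dev/null; then
    polite=accepted
else
    polite=rejected
fi

# Forceful command: the fast-forward test is skipped.
git push -q --force origin master
server_tip=$(git --git-dir="$work/server.git" rev-parse refs/heads/master)
echo "polite push: $polite"
echo "server master now: $server_tip"
```

After the forced push, commit B is no longer reachable from the receiver's master, which is why hosted services often let administrators forbid this on selected branches.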
The rules for updating tag and other names are simpler (but were kind of broken up until Git 1.8.2): tag names simply can't be updated without --force, and most other names use the branch-name rule ("is it a fast-forward").
When anyone uses git push, the receiving end is where all the real authentication, authorization, and logging should happen. Basic Git assumes some other entity is doing those: if you've gotten past whatever mechanism the transport (https or ssh or whatever) provides, everything must be fine! If you're using a hosting provider, it is up to them to provide these.