Could you please advise how I can determine the following details of a Git branch:

  • Who created the branch
  • When the branch was created
  • Which branch the new branch was created from
Nicolas Pepinster
Pete Long
  • I guess the question is about how to get this **in GitLab**, since git itself does not record any of these data. – Romain Valeri Nov 08 '19 at 10:44
  • Romain, that is an interesting point; I was not aware. I would prefer to get this information via the git command line if possible. Yes, I have access to GitLab, but I find the Repository -> Graph option difficult to make sense of. – Pete Long Nov 08 '19 at 11:02
  • Yes, unfortunately, git *cannot* store this volatile information, given its very design. Branches come and go, their metadata (creation time, author, etc.) are not "just temporary", they are actually never stored anywhere by git itself. What you *could* do is infer things based on knowledge of *your* specific workflow, or rely on external tools. – Romain Valeri Nov 08 '19 at 11:08
  • A bit surprised that Git does not store such information, which I would have thought is useful. For example, when I look at a branch 2/3/4 months later, I have no real appreciation of it. The first thing I would like to know is who, when, and why (off which branch), before I can even start to understand the reason for the branch. Am I missing a point? – Pete Long Nov 08 '19 at 12:46
  • Branches are just named pointers to commits. But you can always see all the commit metadata (which, with proper messages and commit practice, can give you those insights) – D. Ben Knoble Nov 08 '19 at 12:59
  • @PeteLong You're not missing a point, you're trying to apply the standards and concepts of a world in another one. The info you need very much depends on *workflows* eventually, and in well set-up git environments, we never lack the info we actually *use*. What you're talking about when you say "branches" is not what git technically labels as such. In other source control systems, yes, branches are stable "lines of development". I'd suggest reading a few definitional articles about git architecture and its pros/cons, it'll probably help you see it better. – Romain Valeri Nov 08 '19 at 13:08
  • Yes, I guess I have a lot to appreciate/understand about Git. Just wanted to say that GitLab allows the creation, update, deletion, etc. of "branches", which would lead me to think such info about a branch would be available. You may say, refer to the Repository -> Graph diagram, but that itself is not easy to follow. My first source code management system was ClearCase. To this day, I have not really found anything that good ... of course, that was a licensed product. – Pete Long Nov 08 '19 at 13:43

1 Answer

You are putting too much faith into the idea of a branch. Git does not care about branches; Git cares about commits. But in order to say this, we must first stop and define the word branch. In this paragraph, what I mean by "branch" is a branch name like master or develop or feature/tall. These names have, to Git, only one function: to record the raw hash ID of a commit.

Git has other entities that people can and do call a "branch". See What exactly do we mean by "branch"?

In any case, a branch name in Git holds the hash ID of one commit. That's actually very useful: both to us humans, for whom Git hash IDs look like random junk, and for Git itself. But that's all they hold by themselves. The important information is all stored elsewhere, mostly in the commits themselves.

Git really cares about commits. Commits store most of the data you're looking for: who made them, when, and from which other earlier commit.

Branch names do not store any of the data you are looking for. They have no creator information. They have no history of their own either (but we'll see something more in a moment). There is an important reason for this.

Unlike ClearCase, Git is a distributed version control system. In ClearCase, there is a central, managed storage location: the Volume Object Base or VOB. Individual users go to the central storage and get things from it, and put things back into it. To do this, these users must all share the VOB's names for things. So if the VOB wants to call something Bruce, well, everyone else might as well call it Bruce too.

With Git, each repository is independent. Everyone has his or her own branch names, independent of everyone else's branch names. The only truly universal name in Git is the hash ID. Everyone agrees on hash IDs: a commit whose hash ID is 08da6496b61341ec45eac36afcc8f94242763468 is 08da6496b61341ec45eac36afcc8f94242763468 in every Git repository. No other commit can have ID 08da6496b61341ec45eac36afcc8f94242763468.

If you and I plan to connect our repositories together, we might want to coordinate our branch names now and then, and we can do that. But my repository has my branch names, and yours has yours. You are the creator of all of your branch names. I literally cannot create a branch name in your repository. All I can do is hand you a commit (by its big ugly hash ID) and then ask your Git to create or update some name.[1]

It's more common to set things up the other way around: I give you read-only access to my repository. You run git fetch to have your Git call up my Git over the Internet. Your Git asks my Git: Hey, other-Git, what branch names do you have? What are the universally-shared hash IDs that go with these branch names? My Git gives you my list of names-and-hash-IDs and your Git works from there:

  • Torek's Git says his master is a123456.... I have a123456..., so I'm good.
  • Torek's Git says his dev is b789abc.... Hm, I don't have b789abc.... Perhaps I should ask for that commit by its hash ID.

This goes on until your Git has all the information my Git hands over in this phase of the communication. Your Git then asks, by hash ID, for any commit it wants. If that commit's parent commit—another hash ID, stored inside the actual commit—represents a commit your Git does not have, your Git can ask for that commit by its hash ID, and so on.

Eventually, your Git gets back to some hash ID that your Git already has (in which case your Git has the commit and doesn't need me to send it at all), or I run out of commits and say "that's all there are".[2] Your Git then creates, in your Git repository, every commit I've handed over.
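The walk described above can be sketched as a small graph traversal. This is a simplified model and not Git's actual negotiation protocol: the function name commits_to_fetch and the dictionary-based commit graph are assumptions made for illustration, and any hash IDs other than a123456 and b789abc are made up.

```python
def commits_to_fetch(remote_parents, remote_refs, local_commits):
    """Return the commit IDs the receiving repository is missing.

    remote_parents: dict mapping commit ID -> list of parent IDs (sender's graph)
    remote_refs:    dict mapping branch name -> tip commit ID (what the sender lists)
    local_commits:  set of commit IDs the receiver already has
    """
    missing = []
    seen = set()
    stack = list(remote_refs.values())  # start from every advertised branch tip
    while stack:
        commit = stack.pop()
        if commit in local_commits or commit in seen:
            continue  # already have it, so there is nothing to ask for
        seen.add(commit)
        missing.append(commit)
        # Ask for the parents next; a root commit has an empty parent list.
        stack.extend(remote_parents[commit])
    return missing


# Hypothetical sender: dev is two commits ahead of master.
remote_parents = {
    "f00d000": [],
    "a123456": ["f00d000"],
    "c0ffee1": ["a123456"],
    "b789abc": ["c0ffee1"],
}
remote_refs = {"master": "a123456", "dev": "b789abc"}
local_commits = {"f00d000", "a123456"}
print(commits_to_fetch(remote_parents, remote_refs, local_commits))
```

The receiver stops walking as soon as it reaches a commit it already holds, which is why only the commits unique to dev are requested in this example.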

The last thing your Git does, in this process, is to create or update your remote-tracking names. Here, instead of using your branch names—which are yours, not mine!—your Git changes all of my names, in a simple and easy-to-handle manner. Your Git uses a nickname for my Git, rather than a URL. Your Git's nickname is probably origin—though you can, and in fact must, set a different one for every other Git you're going to call up like this. The name origin is just the default name for the Git from which you made your initial clone.

So let's assume you're having your Git call my Git origin. Your Git has a complete list of all of my branches and their corresponding (single) hash IDs. So your Git now creates or updates an origin/* name, corresponding to my Git's names. My Git told your Git that my master is a123456..., so your Git creates or updates your origin/master to point to commit a123456.... You either already had this commit, so that mine did not have to send it to you, or did not but asked for it and my Git sent it. My Git told your Git that my dev is b789abc...; your Git has this commit now; your Git creates or updates your origin/dev to point to b789abc....
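As a rough sketch of that last step, the renaming can be modelled with plain dictionaries. The function update_remote_tracking and the sample local hash ID are assumptions for illustration only; real Git stores these names in its ref database rather than in a Python dict.

```python
def update_remote_tracking(local_names, remote_name, advertised_branches):
    """Create or update <remote>/<branch> names from the sender's branch list.

    The receiver's own branch names are never touched; only the
    remote-tracking names are created or overwritten.
    """
    for branch, commit in advertised_branches.items():
        local_names[f"{remote_name}/{branch}"] = commit
    return local_names


# Your own master points somewhere else entirely (hypothetical hash ID).
my_names = {"master": "9876543"}
update_remote_tracking(my_names, "origin", {"master": "a123456", "dev": "b789abc"})
print(my_names)
```

After the call, master still holds 9876543, while origin/master and origin/dev record what the other Git reported.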

Again, it's the commits that matter. Commits—and hash IDs—are the universal currency of Git. Someone else's branch names ... well, those are the someone else's to worry about. We'll copy them down, and save them as remote-tracking names, but they're not branch names. My branch names are mine; your branch names are yours. Whatever Git someone is using, their branch names are theirs, not anyone else's.


[1] It could be a good idea for you to record that it was me who made that update request, but Git itself does not bother to do that. Git itself has no authentication or authorization of any kind. Git relies on some other entity to do any required authentication and/or authorization. Since someone else is doing that (if it's being done at all), Git also leaves any recording of such information to that other entity.

[2] Technically, my Git just hands over a commit that has no parent, i.e., is a root commit. Since it has no parent, there is no earlier commit to ask for.


Git does keep some logs of its own

With the above in mind—the fact that your branch names are yours and you create and update them—we should mention that Git does actually keep logs of your various refs. These logs are also yours, to do with as you wish. But every time your Git updates a ref, it saves the previous value of the ref in a "ref log".

A ref (or reference) in Git is a generalization of branch names (like master), tag names (like v2.1), remote-tracking names (like origin/master), and so on. Each kind of name still just stores one hash ID, but the "kind" of a name determines how you will use it.

A branch name is simply a ref whose full name starts with refs/heads/. So your master is really just the ref named refs/heads/master. A tag name is a ref whose full name starts with refs/tags/, so v2.1 is short for refs/tags/v2.1. A remote-tracking name starts with refs/remotes/ and goes on to include the name of the remote—origin—and another slash, so your remote-tracking name for origin's master is refs/remotes/origin/master.
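The prefix rule is easy to express in code. The helper below, classify_ref, is a hypothetical name used only for this sketch; it covers just the three namespaces mentioned here and labels everything else "other".

```python
def classify_ref(full_name):
    """Return (kind, short_name) for a full ref name, based on its prefix."""
    prefixes = [
        ("refs/heads/", "branch"),
        ("refs/tags/", "tag"),
        ("refs/remotes/", "remote-tracking"),
    ]
    for prefix, kind in prefixes:
        if full_name.startswith(prefix):
            return kind, full_name[len(prefix):]
    return "other", full_name


for name in ("refs/heads/master", "refs/tags/v2.1", "refs/remotes/origin/master"):
    print(name, "->", classify_ref(name))
```

Note that the short name of a remote-tracking ref keeps the remote's name, so refs/remotes/origin/master shortens to origin/master.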

Whenever your Git updates any of these names, it keeps the old value in the reflog for that name. It puts the new value into that name, and puts the current date and time into the reflog along with the old value.

This means that if you want to know what hash ID your own master had yesterday, you can have your Git look through your reflog for your master. If it was updated earlier today, there's an entry in this reflog, and that entry shows the hash ID it held yesterday.
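To make the bookkeeping concrete, here is a toy model of refs with a per-ref log. The RefStore class, its method names, and the sample dates are assumptions for illustration; it is not how Git lays out its reflog files, and it assumes updates are recorded in chronological order.

```python
from datetime import datetime


class RefStore:
    """Toy model: each ref holds one hash ID and has its own update log."""

    def __init__(self):
        self.refs = {}    # ref name -> current hash ID
        self.reflog = {}  # ref name -> list of (when, old_hash, new_hash)

    def update(self, name, new_hash, when):
        old_hash = self.refs.get(name)  # None when the ref is being created
        self.reflog.setdefault(name, []).append((when, old_hash, new_hash))
        self.refs[name] = new_hash

    def value_at(self, name, when):
        """Return the hash ID the ref held at `when`, or None if it did not exist."""
        value = None
        for stamp, _old, new in self.reflog.get(name, []):
            if stamp > when:
                break
            value = new
        return value


store = RefStore()
store.update("refs/heads/master", "a123456", datetime(2019, 11, 7, 9, 0))
store.update("refs/heads/master", "9876543", datetime(2019, 11, 8, 10, 0))
print(store.value_at("refs/heads/master", datetime(2019, 11, 7, 23, 59)))
```

In real Git, the same kind of question can be asked with the reflog syntax master@{yesterday}, which only works while the relevant reflog entries still exist in your own repository.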

These reflog entries eventually expire and get tossed out. This is very different from commits, which, for the most part, live forever.[3] It's also important to remember that your reflogs are yours. You cannot see mine.

There's a very special ref in Git spelled HEAD (in all uppercase), which usually stores the name of a branch rather than a raw hash ID. It can store a raw hash ID; in this mode, Git says that you have a "detached HEAD". But mostly, it remembers the name of the branch that you are "on": if git status says On branch master, this means that HEAD holds the name refs/heads/master inside it. There is a reflog for HEAD, which stores the raw hash ID found by following through the name to the underlying commit hash.
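The two modes of HEAD can be sketched as follows. The function resolve_head is a hypothetical helper, and the refs dictionary stands in for the repository's ref storage. The sketch relies on the fact that an attached HEAD is written as a line beginning with "ref: " followed by the full ref name, while a detached HEAD is just a hash ID.

```python
def resolve_head(head_value, refs):
    """Return (branch_ref_or_None, commit_hash) for the contents of HEAD."""
    prefix = "ref: "
    if head_value.startswith(prefix):
        # Attached: HEAD names a branch, and the branch names the commit.
        ref_name = head_value[len(prefix):].strip()
        return ref_name, refs.get(ref_name)
    # Detached: HEAD holds a raw hash ID directly.
    return None, head_value.strip()


refs = {"refs/heads/master": "a123456"}
print(resolve_head("ref: refs/heads/master\n", refs))  # attached to master
print(resolve_head("b789abc\n", refs))                 # detached HEAD
```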


[3] Technically, a commit lives as long as you can find it. This makes the concept of reachability extremely important. For much more about reachability, see Think Like (a) Git.


git push is actually the odd case here

The commit flow I described above is for git fetch. Note that git pull means run git fetch, then run a second Git command. The second Git command runs only in your repository: your fetch gets read-only access to some other Git repository, gets commits (and other data) from it, and then updates your remote-tracking names. The second command is the one that incorporates new commits into your branches. I like to delay that second command—I usually want to see what git fetch actually fetched before I make any decision about taking, or not taking, new commits—so I rarely use git pull at all.

Meanwhile, the closest thing there is to an opposite of fetch is git push. This has your Git call up another Git, usually by your own name, again: git push origin means you would like your Git to call up the Git at the URL your Git has stored under your name origin. Having done so, your Git then does not get commits from the origin Git. Instead, your Git sends commits (and other objects) to them.

Your Git will say: I have commit 9876543... for you, do you have it yet? If not, your Git will send it over. If 9876543... has parent a123456..., your Git makes sure their Git has a123456... too, and so on. Eventually, your Git reaches a commit that their Git already has, and does not have to send it, or reaches a root commit—one with no parent—so that there are no more commits to send.

Now that your Git has sent to their Git all the commits that you have, that they don't, that they'll need, your Git sends their Git a polite request: Please, if it's OK, set your master to 9876543.... It is up to their Git to accept or reject this request. Basic Git, as opposed to the fancier, for-sale versions from folks like GitHub, Bitbucket, and GitLab, has really simple rules here (though you can write your own fancy ones). Web hosting providers also provide user authentication and logging and so on, and it's up to them to give you fancier controls if you're the administrator. Basic Git's built-in check is just: if I obey this request, will I still be able to find the branch-tip commit that is in the name already?

This test—will I be able to find the current commit, if I accept this new commit hash ID—is a test to see if an operation is a fast forward. This, once again, is all about reachability: starting at 9876543... and working backwards through parent hash IDs, can I get to the hash ID I have in the name now? If the name is new, basic Git just accepts the request, and if the polite request is to delete a name, basic Git just accepts the request.

You can modify this polite request into a forceful command: set your master to 9876543...! They can still reject it, but again, basic Git will follow the command. The "is this a fast-forward" test goes away.
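The receiving side's decision for a branch name can be sketched as a reachability check. This is a simplified model under stated assumptions: the function names is_ancestor and accept_branch_update are invented for this example, the commit graph is a plain dictionary, a new_hash of None stands for a delete request, and none of the configurable checks or hooks a real server might apply are modelled.

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` can be reached from `descendant` via parent links."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def accept_branch_update(parents, refs, name, new_hash, force=False):
    """Decide whether the receiver takes a request to set `name` to `new_hash`."""
    current = refs.get(name)
    if current is None or new_hash is None or force:
        return True  # new name, deletion, or forceful command
    # Polite request: only a fast-forward keeps the current tip findable.
    return is_ancestor(parents, current, new_hash)


parents = {"a123456": [], "9876543": ["a123456"], "b789abc": ["a123456"]}
print(accept_branch_update(parents, {"master": "a123456"}, "master", "9876543"))
print(accept_branch_update(parents, {"master": "b789abc"}, "master", "9876543"))
print(accept_branch_update(parents, {"master": "b789abc"}, "master", "9876543", force=True))
```

In the second call the receiver's master sits on b789abc, which cannot be reached from 9876543, so the polite request is refused; the third call shows the same request succeeding once the fast-forward test is dropped.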

The rules for updating tag and other names are simpler (but were kind of broken up until Git 1.8.2): tag names simply can't be updated without --force, and most other names use the branch-name rule ("is it a fast-forward").

When anyone uses git push, the receiving end is where all the real authentication, authorization, and logging should happen. Basic Git assumes some other entity is doing those: if you've gotten past whatever mechanism the transport (https or ssh or whatever) provides, everything must be fine! If you're using a hosting provider, it is up to them to provide these.

torek
  • Hey torek, that is some response ... in fact it is a sterling response. There is a lot for me to take in all at once. I will look it over and try to understand the detail over the weekend. So for now, I just wanted to say thank you for the reply ... I may have some follow-up queries. Have a good weekend, all. – Pete Long Nov 08 '19 at 18:49