Md Samiul Alim's answer is fine, and also has the virtue of being short—many of mine don't as I won't take the time required to make them short—but if branch names aren't "clicking" for you, the reason might be as simple as this: Branch names, in Git, don't matter.
We—as in people, documentation, etc.—often talk about Git commits as being "on a branch". This isn't wrong. The problem is that it isn't right either. The very notion that some commit is "on" some branch is mushy, muddle-headed, and misleading. Commits in Git are their own thing: Commits exist independently of branches. A commit either is there, in the repository, or it isn't and therefore it does not exist. Commits are Git's raison d'être. In an important sense, nothing else matters: only the commits matter. Git is all about the commits.
Commits are ...
Git's commits are:
- Frozen for all time. This is an inherent property of all of Git's internal objects, though this post won't explain why. The thing to remember here is that once we make some commit, not even Git can change it. The new, unique commit we just made is stuck the way we made it, forever. (If it's bad, we can just let Git eventually "forget it", though again I won't go into any detail here.)
- Numbered. Every commit has a unique number, expressed in hexadecimal, which Git calls a hash ID or object ID. These things are huge, ugly, and generally impossible for humans to remember: 5a73c6bdc717127c2da99f57bc630c4efd8aed02, for example. Git needs the commit number to do anything with the commit, including simply to find it.
- Snapshots. Each commit holds a full snapshot of every file, in the form it had at the time you (or whoever) made the commit. The files inside any one commit are in a special form, readable only by Git itself and literally unwritable by anyone or anything (including Git itself). They're compressed and de-duplicated, so when most commits mostly re-use most of the files from other commits, the commits take almost no space because they're not actually storing the files again.
- Nodes in a graph. This requires more explanation, but the way Git handles this is that besides storing a snapshot, each commit also stores some metadata, or information about the commit itself. That includes who made the commit and when. It includes a log message; you get to supply this log message at the time you run git commit, if you're the one making the commit. And it includes the raw commit numbers (the hash IDs) of a list of earlier commits.
Most commits store, in their metadata, the raw hash ID of exactly one previous commit, which we call the parent of the commit. That means that this commit itself remembers which (single) commit comes just before it. We say that a commit points to its parent, and if we want to draw this, we can do it like this:
<-H
Here, H stands in for the hash ID of the latest commit. It has an arrow coming out of it, pointing backwards to its parent commit:
<-G <-H
The parent of H is G. But G is a commit too, so it has another arrow coming out of it, pointing to its parent F:
... <-F <-G <-H
This repeats all the way back to the very first commit ever: commit A in our example repository here, which therefore has just eight commits in it, A through H. This first commit doesn't point back, because it can't: its list of previous commits is empty. That gives programs like git log, which work by chasing along the backwards-pointing arrows, permission to quit, so that they don't have to run forever.
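One way to see these backwards-pointing arrows for yourself is to look at raw commit objects in a scratch repository. The following is only a minimal sketch: the temporary directory, the `mk` helper, the Example identity, and the commit messages A, B, and C are all made up for illustration, and the commits are empty (no files) purely to keep things short.

```shell
# Scratch repository in a temporary directory (nothing here touches a real repository).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: make an empty commit with a throwaway identity.
mk() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}
mk A; mk B; mk C

# Print the raw commit object: tree, parent, author, committer, then the message.
git cat-file -p HEAD

# The "parent" line in the metadata is the same hash ID that HEAD^ resolves to.
parent_from_object=$(git cat-file -p HEAD | sed -n 's/^parent //p')
parent_from_syntax=$(git rev-parse HEAD^)

# The root commit (A) has no parent line; that is where a backwards walk stops.
root=$(git rev-list --max-parents=0 HEAD)
root_parents=$(git cat-file -p "$root" | grep -c '^parent ' || true)

# Walking backwards from the last commit reaches every commit in the chain.
count=$(git rev-list --count HEAD)
echo "commits reachable from HEAD: $count"
```

Here git rev-list does the same backwards walk that git log does, just without formatting the result for humans.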
This nodes-in-a-graph thing, or pointing backwards, is what makes most of Git work. All Git needs to read every commit in this simple case is for us to supply to Git the raw hash ID of the last commit, H. But where will we keep this last commit hash ID? Do we jot it down on the office whiteboard? Do we write it down on a slip of paper and carry that around in our pockets? We could do either of these, but that's pretty painful. We have a computer: why don't we have the computer store the hash ID, perhaps in a file or something?
Branch and other names help us, and Git, find the commits
This is where branch names come in. A branch name is just an entry in some file—in a database of some sort—in which we have Git store the hash ID of the latest commit.
More precisely, the branch name stores the hash ID of the latest commit that is to be considered "on" that branch. Let's take note of several things here:
Branch names owe their existence to commits. It's not the other way around. The branch name cannot exist without a commit. The name points to a commit. The commit has to exist!
The hash ID stored in a branch name is not permanent. We can update the name. Like erasing the hash ID written on a whiteboard, we can replace the hash ID stored in the name. This means branch names "move around", at least when we draw them like I do (you'll see this in a moment).
If we already have a hash ID, we don't need a branch name. If we find some commit's hash ID somehow—regardless of how—we can give that to Git directly and not bother with a branch name.
More than one branch name can hold the same hash ID.
Let me illustrate the last part, and then show branch names moving around. We'll start with our simple chain ending at H. For laziness and ASCII-art-on-Stack Overflow purposes, I will stop drawing the arrows between commits, but remember that like all parts of any commit, they're frozen for all time, pointing backwards from child to parent:
...--F--G--H <-- main
Here, all I did was add a branch name, main, pointing to (storing the hash ID of) commit H. This means commit H is the latest commit on branch main. Git can find every other commit by starting here and working backwards, so all the commits are on main, but main points to H. (We call this the tip commit of branch main.)
Now let's add a new branch name, br1, and make it point to H too:
...--F--G--H <-- br1, main
This means H is now the latest commit on branch br1. But it's also the latest commit on branch main. So which branch is it on? Git's answer is that it, and all the other commits too, are on both branches at the same time. By creating a new branch name, we changed which branches the commits are on. That's entirely normal in Git! Nothing about the commits has actually changed, it's just that now we have two names by which to find them.
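The two-names-one-commit situation is easy to reproduce. This is a sketch under a few assumptions: it uses a throwaway repository with a single empty commit standing in for H, a made-up Example identity, and the same branch names as the drawing.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch main on any Git version
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "H"

# Creating a branch name only stores a hash ID under a new name; no commit changes.
git branch br1

tip_main=$(git rev-parse main)
tip_br1=$(git rev-parse br1)
echo "main -> $tip_main"
echo "br1  -> $tip_br1"

# List every branch name from which this commit can be reached.
names=$(git for-each-ref --format='%(refname:short)' --contains "$tip_main" refs/heads \
        | sort | paste -sd, -)
echo "commit is on: $names"
```

Both names resolve to the same hash ID, and the commit is reported as contained in both branches.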
Now that we do have two names, we need a way to remember which name we are using. Git does that for us by attaching a special name, HEAD (this isn't a branch name, and it's very much a reserved name in Git¹), to one of the branch names, like this:
...--F--G--H <-- br1, main (HEAD)
Here, we're "on" branch main (the git status command will say on branch main) and hence we're using commit H. If we run:
git switch br1 # or git checkout br1
we get:
...--F--G--H <-- br1 (HEAD), main
We're now "on" br1, but still using commit H, so nothing else changes. Git says to itself: Oh, you'd like to switch branches to br1, hm, that's the same commit we're using now, I don't really need to do anything except update the HEAD, so I'll do that and say all done.
¹ That's why you should be careful to spell it in all uppercase, even if lowercase sometimes works. Or use @, which is a one-character synonym for HEAD. The mechanism Git currently uses to store the branch name in HEAD is to have a file, .git/HEAD. If this file ever goes missing, Git stops believing that the Git repository is a repository. This can happen after a crash: because the HEAD file tends to be active, your computer might decide it's been corrupted and remove it. Putting a HEAD file back in makes Git happy and your repository works again. That's a handy trick to know, if your computer crashes a lot.
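The attachment of HEAD to a branch name can be inspected directly. The sketch below again assumes a throwaway repository with one empty commit standing in for H and a made-up Example identity; reading .git/HEAD by hand assumes the default files-based ref storage.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "H"
git branch br1

before_name=$(git symbolic-ref --short HEAD)
before_commit=$(git rev-parse HEAD)

git checkout -q br1      # git switch br1 is the newer spelling (Git 2.23 and later)

after_name=$(git symbolic-ref --short HEAD)
after_commit=$(git rev-parse HEAD)

echo "HEAD was attached to $before_name, now to $after_name"
echo "commit before: $before_commit"
echo "commit after:  $after_commit"

# With the default files-based ref storage, the attachment is one line in .git/HEAD.
head_file=$(cat .git/HEAD)
echo "$head_file"
```

Only the name that HEAD is attached to changes; the commit in use stays the same.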
Making new commits
You have already made some new commits, probably, so you know that you do this by modifying some files and running git add and then git commit. There is a lot more to know here (a whole bunch of secret stuff that a lot of Git tutorials don't cover very well), but let's jump right to the end now and look at what happens when you run git commit.
You're already "on" some branch:
...--F--G--H <-- br1 (HEAD), main
You modified files and used git add, and now you run:
git commit
Git gathers up a commit log message from you (this is going into the new commit forever, so it's a good idea to write up a nice one, though there are ways to recover from mistakes here) and gets your name and email address from your user.name and user.email settings, in your personal Git configuration.² Git gets the rest of the commit metadata, such as the date-and-time stamp, on its own. As part of that remaining metadata, Git finds the raw hash ID of the current commit H, and puts that into the metadata for the new commit, too.
Git makes the snapshot for the new commit from whatever is in Git's index aka staging area. This is why you had to run git add. We won't go into the details here, but the index / staging-area content is already in the right format for a commit, so this part goes very fast.³ Git combines the snapshot and the assembled metadata and writes out a new commit, which gets a new, unique, big ugly hash ID, but we'll just call this I. New commit I thus points back to existing commit H:
             I
            /
...--F--G--H
and now Git performs its clever trick: Git writes the new commit's hash ID into the current branch name. Since HEAD is attached to br1, this makes br1 point to I instead of H:
             I <-- br1 (HEAD)
            /
...--F--G--H <-- main
You now have a new commit on your new branch. Commit I is now the latest commit on br1. Commit H continues to be on both branches, as are all the previous commits, but new commit I is only on br1, and is the tip commit of br1 now.
If we make a second new commit here we get:
             I--J <-- br1 (HEAD)
            /
...--F--G--H <-- main
The name br1 has moved to point to J now, and now there are two commits that are only on br1.
² This is why you have to set user.name and user.email. On some systems, Git is configured to guess these settings if necessary, so if you didn't have to set them, and git commit does not error out with a message telling you to set them, it's using the guessing system. Did it guess right? If not, configure these.
³ I've seen people complain about how slow this part is. They don't know "slow". (Insert four Yorkshiremen comedy sketch here.) Seriously, pre-Git version control systems sometimes gave you time here to go out for lunch.
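The way a branch name moves when you commit, as described in this section, can be watched in a scratch repository. As before, this is just a sketch: the `mk` helper, the Example identity, and the empty commits named H, I, and J are assumptions made to mirror the drawings.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: make an empty commit with a throwaway identity.
mk() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

mk H
git checkout -q -b br1      # create br1 at H and attach HEAD to it
tip_before=$(git rev-parse br1)

mk I
mk J
tip_after=$(git rev-parse br1)
main_tip=$(git rev-parse main)

# Commits reachable from br1 but not from main: these are "only on br1".
only_on_br1=$(git rev-list --count main..br1)

# Two steps back from the tip of br1 is the commit main still points to.
two_back=$(git rev-parse br1~2)

git log --oneline --decorate --graph --all
```

The name main never moves here, because HEAD was attached to br1 for both commits.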
Cloning: we don't have to use branch names
Suppose some other Git repository has this:
             I--J <-- br1
            /
...--F--G--H <-- main (HEAD)
We now use git clone to copy that repository's commits.
The git clone command:
1. makes a new, empty directory (or folder if you prefer that term) and enters the new directory;
2. creates a new, empty Git repository (one with no commits and no branches) here and does the rest of its Git operations here;
3. adds what Git calls a remote, origin, to store the URL of some existing Git repository;
4. runs any extra git config operations that might be required (we didn't need any here);
5. runs git fetch to connect to the other Git repository and download all of its commits; and
6. creates, in this new repository, one branch and checks that one out.
The one branch that git clone creates in step 6 is the branch you name on the command line:
git clone -b br1 <url>
will have your Git create the name br1, instead of the name main. If you don't give a -b argument, your Git asks their Git which name they recommend. They recommend whichever name their HEAD is attached to, so your Git will make a main.
It helps to picture the repository just before that last step, when it has no branches at all. (The --no-checkout option skips filling in your working tree, but as far as I know it still creates the branch name, so it does not quite get you here.) You don't actually need any branches yet (you only need them when you start creating commits), but it "feels weird" to use a branchless Git repository. Still, let's draw the branchless repository:
             I--J <-- origin/br1
            /
...--F--G--H <-- origin/main
Note how your Git has taken their branch names, br1 and main, and changed them. These are your own Git's remote-tracking names. They remember the branch names the other Git repository had, as of the time you ran git fetch. That was via git clone in this case, but a later git fetch will get any new commits from them, and then update your remote-tracking names.
So, in Git, cloning a repository means get all their commits and none of their branches. The last step (step 6) of a normal clone, though, is a git checkout, and in the right circumstances (including this one) git checkout will make a branch name.
When clone or checkout makes a branch name like this, it uses the remote-tracking name that your Git set up to find the right commit, and makes a branch name in your repository that points to that commit. So if we let git clone create main, we get:
             I--J <-- origin/br1
            /
...--F--G--H <-- main (HEAD), origin/main
Whichever branch we have our Git create here, that's the branch we're on, so if we choose br1 instead, we get:
             I--J <-- br1 (HEAD), origin/br1
            /
...--F--G--H <-- origin/main
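Both outcomes can be reproduced locally by cloning a small repository twice. This is a sketch, not a statement about any particular hosting setup: the directory names other, default-clone, and br1-clone, the `mk` helper, and the Example identity are all hypothetical, and the "other" repository is built to match the drawing (main at H, br1 at J, HEAD attached to main).

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit with a throwaway identity.
mk() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

# "other" plays the role of the repository being cloned.
git init -q other
cd other
git symbolic-ref HEAD refs/heads/main
mk H
git checkout -q -b br1
mk I; mk J
git checkout -q main        # their HEAD is attached to main
cd "$work"

git clone -q "$work/other" default-clone       # no -b: use the name their HEAD recommends
git clone -q -b br1 "$work/other" br1-clone    # -b br1: create br1 instead

default_branches=$(git -C default-clone for-each-ref \
    --format='%(refname:short)' refs/heads | paste -sd, -)
br1_branches=$(git -C br1-clone for-each-ref \
    --format='%(refname:short)' refs/heads | paste -sd, -)
echo "default clone has branches: $default_branches"
echo "-b br1 clone has branches:  $br1_branches"

# Either way, the clone has remote-tracking names for both of their branches.
git -C br1-clone branch -r

their_main=$(git -C other rev-parse main)
our_origin_main=$(git -C br1-clone rev-parse origin/main)
```

Each clone ends up with exactly one branch name of its own, while origin/main and origin/br1 record where the other repository's names pointed.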
Guesswork branch creation
Let's say we cloned with -b br1 above. If we run git checkout main, Git will search our branches and not find a main. The same happens with git switch main (the newer Git command).
Instead of just summarily spitting out an error message, though, Git now uses what used to be called DWIM (Do What I Mean) and is now controlled by --guess or --no-guess. Git will look through our remote-tracking names. If one of them looks right, Git will assume we meant:
- using the remote-tracking name's commit,
- make a new branch name, and
- switch to that branch
which will, in our case, do this:
             I--J <-- br1, origin/br1
            /
...--F--G--H <-- main (HEAD), origin/main
and now we have the same two branch names they had, pointing to the same two commits. We're now "on" commit H via branch main. We're ready to make new commits on either of our branches.
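The guessing behavior described in this section can be checked with a clone that starts out with only br1. Again this is only a sketch: the directories other and clone, the `mk` helper, and the Example identity are made up, and it relies on guessing being enabled, which is the default.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit with a throwaway identity.
mk() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

git init -q other
cd other
git symbolic-ref HEAD refs/heads/main
mk H
git checkout -q -b br1
mk I; mk J
git checkout -q main
cd "$work"

git clone -q -b br1 "$work/other" clone
cd clone

# Before: there is no local branch named main, only origin/main.
if git rev-parse -q --verify refs/heads/main >/dev/null; then
    had_main_before=yes
else
    had_main_before=no
fi

git checkout -q main      # no local main, so Git guesses from origin/main

new_main=$(git rev-parse main)
origin_main=$(git rev-parse origin/main)
upstream_of_main=$(git rev-parse --abbrev-ref 'main@{upstream}')
current=$(git symbolic-ref --short HEAD)
echo "had main before: $had_main_before, now on: $current"
echo "main's upstream setting: $upstream_of_main"
```

The new main points to the same commit as origin/main and has origin/main recorded as its upstream setting.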
GitHub "fork"
The fork button on GitHub is really a special case of cloning but with added features.
If we run git clone to clone some existing GitHub repository on our laptops, we get a new repository on our laptop with no branches (until we create one, maybe as the last step of the clone) but all the commits. We get a name origin that remembers the URL, too.
If we use the GitHub FORK button on a web page, though, GitHub will:
- clone the repository on GitHub for us, into a "fork" in our GitHub account;
- link the two repositories together, so that we can make "pull requests"—also a GitHub feature; and
- cleverly copy all their branch names into our GitHub fork.
That last step is very different from git clone. They do it because they don't make remote-tracking names in the fork: they only make branch names instead.
The linking-together is in some ways vaguely similar to the way our Git saves a URL under the name origin. However, the linking-together on GitHub lets GitHub save a lot of disk space: they can and do literally share underlying files over on GitHub. That's not something that you need to care about, and when you clone a repository locally, you can't use their (GitHub's) disk drives. But it drives a lot of the decisions GitHub make, in terms of why they do things a little differently from Git-on-your-laptop.
This brings us back to your original questions
when I do git fetch upstream model-package. It fetches from the upstream to the local repo right?
This git fetch is a limited form of git fetch upstream. The unlimited one:
- has your Git call up their Git via the remote name upstream and the URL stored there;
- has your Git list out all their branches and commit hash IDs; and
- has your Git get any new commits they have, that you don't, and then update all your remote-tracking names.
The limited one does the same thing but skips updating all but upstream/model-package, and doesn't bother getting commits that won't add to your upstream/model-package. This may go slightly faster now, at the cost of going slightly slower tomorrow if/when you want other commits and/or remote-tracking names updated.
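The difference between the limited and unlimited fetch shows up in which remote-tracking names exist afterwards. The sketch below keeps the remote name upstream and the branch name model-package from the question; everything else (the directories theirs and mine, the `mk` and `has_ref` helpers, the Example identity, the commit messages) is hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit with a throwaway identity.
mk() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

# "theirs" stands in for the upstream repository, with two branches.
git init -q theirs
cd theirs
git symbolic-ref HEAD refs/heads/main
mk "base"
git checkout -q -b model-package
mk "package work"
git checkout -q main
mk "more main work"
cd "$work"

git init -q mine
cd mine
git remote add upstream "$work/theirs"

# Hypothetical helper: report whether a given name exists in this repository.
has_ref() {
    if git rev-parse -q --verify "$1" >/dev/null; then echo yes; else echo no; fi
}

git fetch -q upstream model-package      # the limited form
pkg_after_limited=$(has_ref refs/remotes/upstream/model-package)
main_after_limited=$(has_ref refs/remotes/upstream/main)

git fetch -q upstream                    # the unlimited form
main_after_full=$(has_ref refs/remotes/upstream/main)

echo "after limited fetch: model-package=$pkg_after_limited main=$main_after_limited"
echo "after full fetch:    main=$main_after_full"
```

Only upstream/model-package appears after the limited fetch; upstream/main shows up once the unlimited fetch runs.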
What happens when I do git checkout model-package? Does it point to the remote model-package branch?
No, or not quite:
- If you have a branch name model-package, your local Git picks out that name and attaches HEAD there, and checks out that commit.
- If you don't have model-package, your Git uses --guess. If there's both an origin/model-package and an upstream/model-package, your Git says that it has too many choices here and gives up. If there's only one matching name, your Git creates a new branch name, as below for -b (lowercase).
What happens when I do git checkout -B model-package upstream/model-package? Does it creates a new model-package branch on the local repo that keeps on sync with the upstream/model-package branch?
With a lowercase -b, your Git:
- finds upstream/model-package (the remote-tracking name);
- creates a new model-package pointing to the same commit, and checks that one out.
If your Git can't create a new branch model-package, because there's an existing one in the way, your Git gives you an error here.
If not, you now have a model-package. Git sets the upstream (that's a different thing from the remote named upstream, though unfortunately horribly similar looking) for the new branch to upstream/model-package. The upstream setting of a branch is just a loose connection for various operations; see Why do I have to "git push --set-upstream origin <branch>"? and Why do I need to do `--set-upstream` all the time?
With an uppercase -B option, the "error out" variant doesn't happen. Git says: Oh I see you already have a model-package ... I should complain, but, uppercase B, I guess you want me to delete that one and create a new one in its place, more or less. So the old model-package name, which selected some particular commit, is simply overwritten. The new model-package name, which occupies the same database slot as the old one, now points to the same commit as upstream/model-package.
The precise details of what happens when resetting an existing model-package branch are a little tricky. The documentation does not go into complete detail. In particular, if you're on that branch right now, does it error out? I'd have to experiment to see.
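For the simpler case where you are on some other branch, the difference between -b and -B can be tried out safely in a scratch setup. This sketch keeps the names upstream and model-package from the question; the directories theirs and mine, the `mk` helper, the Example identity, and the commit messages are hypothetical. It deliberately switches back to main first, so it does not answer the open question about resetting the branch you are currently on.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit with a throwaway identity.
mk() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

git init -q theirs
cd theirs
git symbolic-ref HEAD refs/heads/main
mk "base"
git checkout -q -b model-package
mk "package work"
git checkout -q main
cd "$work"

git clone -q -o upstream "$work/theirs" mine   # -o names the remote upstream, not origin
cd mine

# Lowercase -b: creates model-package from the remote-tracking name.
git checkout -q -b model-package upstream/model-package
upstream_setting=$(git rev-parse --abbrev-ref 'model-package@{upstream}')

mk "local experiment"      # make our model-package differ from theirs
git checkout -q main

# Lowercase -b again: the name is already taken, so Git refuses.
if git checkout -q -b model-package upstream/model-package 2>/dev/null; then
    b_result=created
else
    b_result=refused
fi

# Uppercase -B: the existing name is overwritten to match upstream/model-package.
git checkout -q -B model-package upstream/model-package
reset_tip=$(git rev-parse model-package)
upstream_tip=$(git rev-parse upstream/model-package)
echo "-b on existing name: $b_result"
echo "upstream setting: $upstream_setting"
```

After -B, the "local experiment" commit is no longer findable through the name model-package, which is what "simply overwritten" means here.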