Git Checkout Vs Git Checkout -B with upstream repo

Question

Say I forked the Upstream. My remote repo has a model-package branch.

When I run git fetch upstream model-package, it fetches from the upstream to the local repo, right? What happens when I run git checkout model-package? Does it point to the remote model-package branch?

What happens when I run git checkout -B model-package upstream/model-package? Does it create a new model-package branch on the local repo that stays in sync with the upstream/model-package branch? If so, will it replace the previous model-package branch?

I tested this in practice, but did not quite get it. Can someone clarify this? Thanks.

score 1 · Answer 1 · edited Nov 17 '21 at 04:38


git fetch upstream model-package: This command downloads any new commits for the model-package branch from the upstream remote, and records them under the remote-tracking name upstream/model-package.

You can compare them against your local model-package branch via git diff model-package upstream/model-package, and if you're happy with the changes, you can integrate them via git pull upstream model-package. Note that git pull merges into whichever branch is currently checked out, so check out model-package first.

git checkout model-package: This moves you from your current branch to the named branch, model-package. Any commits you make on this branch stay in your local repository until you push them to a remote.

git checkout -B model-package upstream/model-package: This creates a local model-package branch pointing at the same commit as upstream/model-package (or, if a local model-package already exists, resets it to that commit) and checks it out. The new branch tracks upstream/model-package, but nothing is synced automatically: your local branch changes only when you commit, merge, or pull, and the remote changes only when you push. You can change which remote-tracking branch it tracks at any time, to any branch on any remote, for example with git branch --set-upstream-to.

Note: Correct me if any of the points I have mentioned here are incorrect.

score 1 · Accepted Answer · answered Nov 17 '21 at 20:46

Md Samiul Alim's answer is fine, and also has the virtue of being short—many of mine don't as I won't take the time required to make them short—but if branch names aren't "clicking" for you, the reason might be as simple as this: Branch names, in Git, don't matter.

We—as in people, documentation, etc.—often talk about Git commits as being "on a branch". This isn't wrong. The problem is that it isn't right either. The very notion that some commit is "on" some branch is mushy, muddle-headed, and misleading. Commits in Git are their own thing: Commits exist independently of branches. A commit either is there, in the repository, or it isn't and therefore it does not exist. Commits are Git's raison d'être. In an important sense, nothing else matters: only the commits matter. Git is all about the commits.

Commits are ...

Git's commits are:

Frozen for all time. This is an inherent property of all of Git's internal objects, though this post won't explain why. The thing to remember here is that once we make some commit, not even Git can change it. The new, unique commit we just made is stuck the way we made it, forever. (If it's bad, we can just let Git eventually "forget it", though again I won't go into any detail here.)
Numbered. Every commit has a unique number, expressed in hexadecimal, which Git calls a hash ID or object ID. These things are huge, ugly, and generally impossible for humans to remember: 5a73c6bdc717127c2da99f57bc630c4efd8aed02 for example. Git needs the commit number to do anything with the commit, including simply to find it.
Snapshots. Each commit holds a full snapshot of every file, as of the form it had at the time you (or whoever) made the commit. The files inside any one commit are in a special form, readable only by Git itself and literally unwritable by anyone or anything (including Git itself). They're compressed and de-duplicated, so when most commits mostly re-use most of the files from other commits, the commits take almost no space because they're not actually storing the files again.
Nodes in a graph. This requires more explanation, but the way Git handles this is that besides storing a snapshot, each commit also stores some metadata, or information about the commit itself. That includes who made the commit and when. It includes a log message; you get to supply this log message at the time you run git commit, if you're the one making the commit. And, it includes the raw commit numbers—the hash IDs—of a list of earlier commits.

Most commits store, in their metadata, the raw hash ID of exactly one previous commit, which we call the parent of the commit. That means that this commit itself remembers which (single) commit comes just before it. We say that a commit points to its parent, and if we want to draw this, we can do it like this:

<-H

Here, H stands in for the hash ID of the latest commit. It has an arrow coming out of it, pointing backwards to its parent commit:

        <-G <-H

The parent of H is G. But G is a commit too, so it has another arrow coming out of it, pointing to its parent F:

... <-F <-G <-H

This repeats all the way back to the very first commit ever: commit A in our example repository here, which therefore has just eight commits in it, A through H. This first commit doesn't point back, because it can't: its list of previous commits is empty. That gives programs like git log, which work by chasing along the backwards-pointing arrows, permission to quit, so that they don't have to run forever.
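The backwards-pointing arrows can be inspected directly. The following is a minimal sketch, not part of the original answer: it builds a throwaway repository with three empty commits (the messages A, B, C, the identity, and the temporary directory are all illustrative assumptions) and prints the parent hash IDs that each commit stores.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch "main" on any Git version
git config user.name "Example User"        # identity for this throwaway repo only
git config user.email "user@example.com"

for msg in A B C; do
    git commit -q --allow-empty -m "$msg"   # empty commits are enough to build a chain
done

tip=$(git rev-parse HEAD)                   # hash ID of the last commit (C)
parent=$(git rev-parse HEAD~1)              # hash ID of the commit before it (B)
root=$(git rev-list --max-parents=0 HEAD)   # the commit whose parent list is empty (A)

git cat-file -p "$tip"                      # raw metadata: tree, parent, author, committer, message
git rev-list --parents HEAD                 # each line: a commit followed by its parent(s)
```

The root commit's line in the last listing shows only its own hash ID, which is the "permission to quit" that git log relies on.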

This nodes-in-a-graph thing, or pointing backwards, is what makes most of Git work. All Git needs to read every commit in this simple case is for us to supply to Git the raw hash ID of the last commit, H. But where will we keep this last commit hash ID? Do we jot it down on the office whiteboard? Do we write it down on a slip of paper and carry that around in our pockets? We could do either of these, but that's pretty painful. We have a computer: why don't we have the computer store the hash ID, perhaps in a file or something?

Branch and other names help us, and Git, find the commits

This is where branch names come in. A branch name is just an entry in some file—in a database of some sort—in which we have Git store the hash ID of the latest commit.

More precisely, the branch name stores the hash ID of the latest commit that is to be considered "on" that branch. Let's take note of several things here:

Branch names owe their existence to commits. It's not the other way around. The branch name cannot exist without a commit. The name points to a commit. The commit has to exist!
The hash ID stored in a branch name is not permanent. We can update the name. Like erasing the hash ID written on a whiteboard, we can replace the hash ID stored in the name. This means branch names "move around", at least when we draw them like I do (you'll see this in a moment).
If we already have a hash ID, we don't need a branch name. If we find some commit's hash ID somehow—regardless of how—we can give that to Git directly and not bother with a branch name.
More than one branch name can hold the same hash ID.

Let me illustrate the last part, and then show branch names moving around. We'll start with our simple chain ending at H. For laziness and ASCII-art-on-Stack Overflow purposes, I will stop drawing the arrows between commits, but remember that like all parts of any commit, they're frozen for all time, pointing backwards from child to parent:

...--F--G--H   <-- main

Here, all I did was add a branch name, main, pointing to (storing the hash ID of) commit H. This means commit H is the latest commit on branch main. Git can find every other commit by starting here and working backwards, so all the commits are on main, but main points to H. (We call this the tip commit of branch main.)

Now let's add a new branch name, br1, and make it point to H too:

...--F--G--H   <-- br1, main

This means H is now the latest commit on branch br1. But it's also the latest commit on branch main. So which branch is it on? Git's answer is that it, and all the other commits too, are on both branches at the same time. By creating a new branch name, we changed which branches the commits are on. That's entirely normal in Git! Nothing about the commits has actually changed, it's just that now we have two names by which to find them.
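A small sketch of the two-names-one-commit situation, under the assumption of a scratch repository with two empty commits standing in for G and H (the identity and directory are made up for the example):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # call the initial branch "main"
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "G"
git commit -q --allow-empty -m "H"

git branch br1                   # a new name for the same commit; no commit is copied
git rev-parse main br1           # prints the same hash ID twice
git branch --contains HEAD~1     # G is reachable from, and so "on", both main and br1
```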

Now that we do have two names, we need a way to remember which name we are using. Git does that for us by attaching a special name, HEAD—this isn't a branch name and it's very much a reserved name in Git¹—to one of the branch names, like this:

...--F--G--H   <-- br1, main (HEAD)

Here, we're "on" branch main—the git status command will say on branch main—and hence we're using commit H. If we run:

git switch br1        # or git checkout br1

we get:

...--F--G--H   <-- br1 (HEAD), main

We're now "on" br1, but still using commit H, so nothing else changes. Git says to itself: Oh, you'd like to switch branches to br1, hm, that's the same commit we're using now, I don't really need to do anything except update the HEAD, so I'll do that and say all done.

¹That's why you should be careful to spell it in all uppercase, even if lowercase sometimes works. Or use @, which is a one-character synonym for HEAD. The mechanism Git currently uses to store the branch name in HEAD is to have a file, .git/HEAD. If this file ever goes missing, Git stops believing that the Git repository is a repository. If your computer ever crashes, because the HEAD file tends to be active, your computer might decide it's been corrupted and remove it. Sometimes this makes Git declare that your repository isn't a repository any more. Putting a HEAD file back in makes Git happy and your repository works again. That's a handy trick to know, if your computer crashes a lot.
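To see HEAD re-attach while the commit stays the same, here is a hedged sketch using a scratch repository (the single commit H, the identity, and the directory are assumptions for illustration; it also assumes the default files-based ref storage, so that .git/HEAD is a plain text file):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git branch br1

before=$(git rev-parse HEAD)     # the commit in use while on main
git symbolic-ref --short HEAD    # the branch name HEAD is attached to
git checkout -q br1              # same commit, so only HEAD needs updating
git symbolic-ref --short HEAD
after=$(git rev-parse HEAD)      # the commit in use while on br1
cat .git/HEAD                    # the file the footnote talks about
```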

Making new commits

You have already made some new commits, probably, so you know that you do this by modifying some files and running git add and then git commit. There is a lot more to know here—a whole bunch of secret stuff that a lot of Git tutorials don't cover very well—but let's jump right to the end now and look at what happens when you run git commit.

You're already "on" some branch:

...--F--G--H   <-- br1 (HEAD), main

You modified files and used git add and now you run:

git commit

Git gathers up a commit log message from you—this is going into the new commit forever, so it's a good idea to write up a nice one, though there are ways to recover from mistakes here—and gets your name and email address from your user.name and user.email settings, in your personal Git configuration.² Git gets the rest of the commit metadata—such as the date-and-time stamp—on its own. Among that rest-of-the-metadata, Git finds the raw hash ID of the current commit H, and puts that into the metadata for the new commit, too.

Git makes the snapshot for the new commit from whatever is in Git's index aka staging area. This is why you had to run git add. We won't go into the details here, but the index / staging-area content is already in the right format for a commit, so this part goes very fast.³ Git combines the snapshot and the assembled metadata and writes out a new commit, which gets a new, unique, big ugly hash ID, but we'll just call this I. New commit I thus points back to existing commit H:

             I
            /
...--F--G--H

and now Git performs its clever trick: Git writes the new commit's hash ID into the current branch name. Since HEAD is attached to br1, this makes br1 point to I instead of H:

             I   <-- br1 (HEAD)
            /
...--F--G--H   <-- main

You now have a new commit on your new branch. Commit I is now the latest commit on br1. Commit H continues to be on both branches—as are all the previous commits—but new commit I is only on br1, and is the tip commit of br1 now.

If we make a second new commit here we get:

             I--J   <-- br1 (HEAD)
            /
...--F--G--H   <-- main

The name br1 has moved to point to J now, and now there are two commits that are only on br1.

²This is why you have to set user.name and user.email. On some systems, Git is configured to guess these settings if necessary, so if you didn't have to set them, and git commit does not error out with a message telling you to set them, it's using the guessing system. Did it guess right? If not, configure these.

³I've seen people complain about how slow this part is. They don't know "slow". (Insert four Yorkshiremen comedy sketch here.) Seriously, pre-Git version control systems sometimes gave you time here to go out for lunch.
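The way a branch name moves as commits are added, as in the drawings with I and J, can be reproduced with a short sketch. It assumes a scratch repository and uses empty commits named H, I, and J purely as stand-ins:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"

git checkout -q -b br1                      # create br1 at H and attach HEAD to it
h=$(git rev-parse HEAD)                     # remember H's hash ID
git commit -q --allow-empty -m "I"          # br1 moves to I; main stays at H
git commit -q --allow-empty -m "J"          # br1 moves to J
git log --oneline --graph --decorate --all  # shows where each name points now
```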

Cloning: we don't have to use branch names

Suppose some other Git repository has this:

             I--J   <-- br1
            /
...--F--G--H   <-- main (HEAD)

We now use git clone to copy that repository's commits.

The git clone command:

1. makes a new, empty directory (or folder if you prefer that term) and enters the new directory;
2. creates a new, empty Git repository—one with no commits and no branches—here and does the rest of its Git operations here;
3. adds what Git calls a remote, origin, to store the URL of some existing Git repository;
4. runs any extra git config operations that might be required (we didn't need any here);
5. runs git fetch to connect to the other Git repository and download all of its commits; and
6. creates, in this new repository, one branch and checks that one out.

The one branch that git clone creates in step 6 is the branch you name on the command line:

git clone -b br1 <url>

will have your Git create the name br1, instead of the name main. If you don't give a -b argument, your Git asks their Git which name they recommend. They recommend whichever name their HEAD is attached to, so your Git will make a main.

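You can also end up with no branch names at all, for instance by doing the early steps by hand (git init, git remote add, git fetch) and never running the final checkout. (Note that git clone --no-checkout is not enough for this: it skips populating the working tree, but still creates the branch name.) You don't actually need any branches yet (you only need them when you start creating commits), but it "feels weird" to use a branchless Git repository. Still, let's draw the branchless repository you would get this way: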

             I--J   <-- origin/br1
            /
...--F--G--H   <-- origin/main

Note how your Git has taken their branch names, br1 and main, and changed them. These are your own Git's remote-tracking names. They remember the branch names the other Git repository had, as of the time you ran git fetch. That was via git clone in this case, but a later git fetch will get any new commits from them, and then update your remote-tracking names.

So, in Git, cloning a repository means get all their commits and none of their branches. The last step—step 6—of a normal clone, though, is a git checkout, and in the right circumstances—including this one—git checkout will make a branch name.
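One way to observe the "all their commits, none of their branches" state is to perform the init, remote-add, and fetch steps by hand and stop before any checkout. This is only a sketch: the two directories under a temporary folder stand in for "their" repository and "ours", and the commits are empty placeholders.

```shell
set -e
work=$(mktemp -d)

# "Their" repository: main at H, br1 two commits further on.
git init -q "$work/theirs" && cd "$work/theirs"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b br1
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"
git checkout -q main

# "Our" repository: init, add the remote, fetch, and then stop.
git init -q "$work/ours" && cd "$work/ours"
git remote add origin "$work/theirs"
git fetch -q origin
git branch       # prints nothing: we have no branch names yet
git branch -r    # remote-tracking names such as origin/br1 and origin/main
```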

When clone or checkout makes a branch name like this, it uses the remote-tracking name that your Git set up to find the right commit, and makes a branch name in your repository that points to that commit. So if we let git clone create main, we get:

             I--J   <-- origin/br1
            /
...--F--G--H   <-- main (HEAD), origin/main

Whichever branch we have our Git create here, that's the branch we're on, so if we choose br1 instead, we get:

             I--J   <-- br1 (HEAD), origin/br1
            /
...--F--G--H   <-- origin/main

Guesswork branch creation

Let's say we cloned with -b br1 above. If we run git checkout main, Git will search our branches and not find a main. The same happens with git switch main (the newer Git command).

Instead of just summarily spitting out an error message, though, Git now uses what used to be called DWIM (Do What I Mean) and is now controlled by --guess or --no-guess. Git will look through our remote-tracking names. If one of them looks right, Git will assume we meant:

use the remote-tracking name's commit,
make a new branch name, and
switch to that branch

which will, in our case, do this:

             I--J   <-- br1, origin/br1
            /
...--F--G--H   <-- main (HEAD), origin/main

and now we have the same two branch names they had, pointing to the same two commits. We're now "on" commit H via branch main. We're ready to make new commits on either of our branches.
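The guessing behavior can be tried out with a sketch like the following, which assumes default settings (no --no-guess and no checkout.guess=false) and uses two scratch directories as "their" repository and our clone:

```shell
set -e
work=$(mktemp -d)

# "Their" repository: main at H, br1 at J.
git init -q "$work/theirs" && cd "$work/theirs"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b br1
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"
git checkout -q main

git clone -q -b br1 "$work/theirs" "$work/ours"   # our only branch is br1
cd "$work/ours"
git branch              # lists br1 only
git checkout -q main    # no local main, so Git creates one from origin/main
git branch -vv          # main now exists, with origin/main as its upstream
```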

GitHub "fork"

The fork button on GitHub is really a special case of cloning but with added features.

If we run git clone to clone some existing GitHub repository on our laptops, we get a new repository on our laptop with no branches (until we create one, maybe as the last step of the clone) but all the commits. We get a name origin that remembers the URL, too.

If we use the GitHub FORK button on a web page, though, GitHub will:

clone the repository on GitHub for us, into a "fork" in our GitHub account;
link the two repositories together, so that we can make "pull requests"—also a GitHub feature; and
cleverly copy all their branch names into our GitHub fork.

That last step is very different from git clone. They do it because they don't make remote-tracking names in the fork: they only make branch names instead.

The linking-together is in some ways vaguely similar to the way our Git saves a URL under the name origin. However, the linking-together on GitHub lets GitHub save a lot of disk space: they can and do literally share underlying files over on GitHub. That's not something that you need to care about, and when you clone a repository locally, you can't use their (GitHub's) disk drives. But it drives a lot of the decisions GitHub make, in terms of why they do things a little differently from Git-on-your-laptop.

This brings us back to your original questions

when I do git fetch upstream model-package. It fetches from the upstream to the local repo right?

This git fetch is a limited form of git fetch upstream. The unlimited one:

has your Git call up their Git via the remote name upstream and the URL stored there;
has your Git list out all their branches and commit hash IDs; and
has your Git get any new commits they have, that you don't, and then update all your remote-tracking names.

The limited one does the same thing but skips updating all but upstream/model-package, and doesn't bother getting commits that won't add to your upstream/model-package. This may go slightly faster now, at the cost of going slightly slower tomorrow if/when you want other commits and/or remote-tracking names updated.
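The difference between the limited and unlimited fetch shows up in which remote-tracking names exist afterwards. A rough sketch, assuming a scratch directory standing in for the upstream repository with branches main and model-package (the commits are empty placeholders):

```shell
set -e
work=$(mktemp -d)

# Stand-in for the upstream repository, with two branches.
git init -q "$work/upstream-repo" && cd "$work/upstream-repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b model-package
git commit -q --allow-empty -m "I"
git checkout -q main

git init -q "$work/local" && cd "$work/local"
git remote add upstream "$work/upstream-repo"

git fetch -q upstream model-package    # limited: only this one branch
git branch -r                          # upstream/model-package, no upstream/main
main_after_limited=$(git for-each-ref refs/remotes/upstream/main)

git fetch -q upstream                  # unlimited: every branch they have
git branch -r                          # now includes upstream/main as well
```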

What happens when I do git checkout model-package? Does it point to the remote model-package branch?

No, or not quite:

If you have a branch name model-package, your local Git picks out that name and attaches HEAD there, and checks out that commit.
If you don't have model-package, your Git uses --guess. If there's both an origin/model-package and an upstream/model-package, your Git says that it has too many choices here and gives up. If there's only one matching name, your Git creates a new branch name, as below for -b (lowercase).
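The "too many choices" case can be provoked on purpose. In this sketch the same scratch repository is registered under both origin and upstream, purely to create the clash; it also assumes checkout.defaultRemote is not set, since that setting exists to break exactly this tie.

```shell
set -e
work=$(mktemp -d)

git init -q "$work/theirs" && cd "$work/theirs"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b model-package
git commit -q --allow-empty -m "I"
git checkout -q main

git clone -q "$work/theirs" "$work/local"    # gives us origin/model-package
cd "$work/local"
git remote add upstream "$work/theirs"
git fetch -q upstream                         # gives us upstream/model-package too

if git checkout model-package >/dev/null 2>&1; then
    result="created"
else
    result="refused"                          # two candidates, so Git gives up
fi
echo "$result"

# Naming the remote-tracking branch explicitly resolves the ambiguity.
git checkout -q --track upstream/model-package
```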

What happens when I do git checkout -B model-package upstream/model-package? Does it create a new model-package branch on the local repo that keeps in sync with the upstream/model-package branch?

With a lowercase -b, your Git:

finds upstream/model-package (the remote-tracking name);
creates a new model-package pointing to the same commit, and checks that one out.

If your Git can't create a new branch model-package—because there's an existing one in the way—your Git gives you an error here.

If not, you now have a model-package. Git sets the upstream for the new branch to upstream/model-package. (The upstream setting of a branch is a different thing from the remote named upstream, though unfortunately the two look horribly similar.) The upstream setting of a branch is just a loose connection for various operations; see Why do I have to "git push --set-upstream origin <branch>"? and Why do I need to do `--set-upstream` all the time?

With an uppercase -B option, the "error out" variant doesn't happen. Git says: Oh I see you already have a model-package... I should complain, but, uppercase B, I guess you want me to delete that one and create a new one in its place, more or less. So the old model-package name, which selected some particular commit, is simply overwritten. The new model-package name, which occupies the same database slot as the old one, now points to the same commit as upstream/model-package.
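The contrast between -b and -B can be checked with a sketch along these lines. It assumes a scratch repository playing the role of upstream (cloned with -o upstream so the remote gets that name) and a deliberately stale local model-package that is in the way:

```shell
set -e
work=$(mktemp -d)

git init -q "$work/upstream-repo" && cd "$work/upstream-repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b model-package
git commit -q --allow-empty -m "I"
git checkout -q main

git clone -q -o upstream "$work/upstream-repo" "$work/local"
cd "$work/local"
git branch --no-track model-package main     # stale local branch, still at H

if git checkout -q -b model-package upstream/model-package 2>/dev/null; then
    lower="created"
else
    lower="refused"                          # lowercase -b: the name is taken
fi
echo "$lower"

git checkout -q -B model-package upstream/model-package   # uppercase -B: overwrite it
git branch -vv                               # model-package now at I, tracking upstream
```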

The precise details of what happens when resetting an existing model-package branch are a little tricky. The documentation does not go into complete detail. In particular, if you're on that branch right now, does it error out? I'd have to experiment to see.
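That experiment is easy to set up. The git checkout documentation describes -B as the transactional equivalent of git branch -f followed by git checkout, which suggests it should succeed even on the current branch; the sketch below checks rather than assumes, using a scratch repository with a clean working tree (all names and paths are illustrative).

```shell
set -e
work=$(mktemp -d)

git init -q "$work/upstream-repo" && cd "$work/upstream-repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b model-package
git commit -q --allow-empty -m "I"
git checkout -q main

git clone -q -o upstream "$work/upstream-repo" "$work/local"
cd "$work/local"
git checkout -q --no-track -b model-package upstream/main   # we are now ON model-package, at H

if git checkout -q -B model-package upstream/model-package; then
    outcome="succeeded"      # the current branch was reset in place
else
    outcome="failed"
fi
echo "$outcome"
git log --oneline --decorate -1    # where model-package points afterwards
```

With uncommitted changes in the working tree the outcome may differ, so that case is worth testing separately.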