Regarding your secondary question:
... how can i check the branch structure. For example if Dev branch is a sub branch of main
Branches—or more precisely, branch names—are never super- or sub-branches of any other branch names. They all live at the same "level", as it were:
The first trick here is that in Git, the branch names don't really matter. What really matters are the commits themselves. Those commits exist regardless of which branches contain them.
The second trick—which doesn't undo the first one!—is that the branch names are how we find the commits. Since the commits matter, this makes the branch names matter ... but only to the extent that we need those branch names to find the commits.
As Jatin Mehrotra commented, you can use git log --graph
(usually with additional flags) to ask Git to draw a graph of your commits. For (much) more about this, see Pretty git branch graphs. For StackOverflow posting purposes, I like to draw my graphs horizontally, using either o
characters for each commit, like this:
...--o--o--o <-- master
or single uppercase letters to stand in for the actual hash IDs of each commit. You'll see the real hash IDs of commits when you run git log
. They are very big and ugly and impossible for mere humans to deal with—you need a computer to use the hash IDs—which is why, in my drawings, I use the letters instead:
... <-F <-G <-H <-- master
Here, we have a string of commits ending at commit H
(H
is short for Hash-ID). Git finds the hash ID of commit H
because the branch name master
holds it. That is, we can literally ask Git: What hash ID does the name master
represent?
$ git rev-parse master
225365fb5195e804274ab569ac3cc4919451dc7f
So H
here might stand in for 225365fb5195e804274ab569ac3cc4919451dc7f
.
Commit H
itself is made up of two parts. Both parts are completely read-only! Nothing can ever change these, so commit H
saves these two things forever:
One part of the commit is a snapshot of every file. The files inside a commit are stored in a special, read-only, Git-only, compressed and de-duplicated form. They are, in effect, an archive of your files. The de-duplication is there because most commits mostly re-use files from some other commit, and this means that the re-used files don't take any space.
The other part of the commit is some metadata, or information about the commit itself. This includes the name and email address of the person who made the commit, for instance. It includes the date-and-time of when they made the commit. It includes their log message in which they should explain why they made the commit. And, for Git's own internal purposes, the metadata includes the actual hash ID of some earlier commit or commits.
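To look at those two parts directly, one option is git cat-file -p, which prints a commit's raw contents. This is a minimal sketch that first builds a throwaway repository; the file name, commit messages, and user identity are invented for illustration.

```shell
# Throwaway repository with two commits; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # pick the initial branch name explicitly
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "G: first commit"
echo world >> file.txt
git commit -q -a -m "H: second commit"

# Raw contents of commit H: a "tree" line (the snapshot), a "parent" line
# (the hash ID of G), author and committer lines, then the log message.
git cat-file -p HEAD

# The parent line in H's metadata is exactly the hash ID of G.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
prev=$(git rev-parse HEAD~1)
echo "parent stored in H: $parent"
echo "hash ID of G:       $prev"
```

The two echoed hash IDs should match, which is all that "H points to G" means.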
The earlier commit—the one that comes before commit H
—has some big ugly hash ID, such as 140045821aa78da3a80a7d7c8f707b955e1ab40d
(let's call that G
for short). This looks random,1 so there's no easy way to associate one hash—or one commit—with the other, except for the fact that commit H
has commit G
's hash ID inside H
. We say that H
points to G
.
Commit G
, in turn, stores a snapshot—a full set of every file; comparing the snapshot in G
to the one in H
shows what files got changed—and metadata. The metadata in G
points to a still-earlier commit, F
. Commit F
, like G
and H
, stores a snapshot and metadata, so from F
Git can go backwards to an even-earlier commit, and so on down the line.
1It's actually a cryptographic checksum, similar in some ways to some of the tricks that power digital currencies.
Branch names
This stuff (the commits with their snapshots and metadata) is mostly what's really in a repository. Each commit holds a snapshot of files, which Git can extract for you to see and work on / with, and metadata, which Git can use to find earlier commits. But to use the latest commit, we have to know which one is the latest. That's where the branch names come in.
Each branch name holds just one hash ID. The hash ID stored in the branch name is, by definition, the last commit that's on that branch. So, suppose we have this:
          I--J   <-- branch1
         /
...--G--H
         \
          K--L   <-- branch2
This drawing indicates that the branch name branch1
holds the hash ID of commit J
, i.e., points to commit J
. Commit J
is therefore the last commit on this branch.
Similarly, branch2
holds the hash ID of commit L
. Commit L
is therefore the last commit on this branch.
Which branch is commit H
on? This is a bit of a trick question, but the diagram itself is meant as a hint:
Commit J is on branch1, and J points back to I, so commit I is on branch1 too. J is the last commit on branch1, so all the earlier commits reachable from J are also on branch1.
Commit K
is on branch2
, by virtue of commit L
being on branch2
.
Commit H
is thus on both branches at the same time. At least, it is in Git. (Other version control systems do this differently.)
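To ask Git these questions directly (including the original "is Dev a sub-branch of main" one), a few commands are usually enough. This sketch rebuilds the drawing with empty commits in a scratch repository; the identity settings and the use of --allow-empty are just conveniences for the example.

```shell
# Rebuild the drawing: G--H shared, I--J on branch1, K--L on branch2.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example User"
git config user.email "user@example.com"
for c in G H; do git commit -q --allow-empty -m "$c"; done
h=$(git rev-parse HEAD)                    # remember commit H
git branch branch2                         # branch2 starts out at H as well
for c in I J; do git commit -q --allow-empty -m "$c"; done
git checkout -q branch2
for c in K L; do git commit -q --allow-empty -m "$c"; done

# Draw the graph of everything reachable from any name.
git log --graph --oneline --all

# Which branch names can reach commit H?
on=$(git branch --contains "$h" --format='%(refname:short)')
echo "$on"

# Where do the two branches share history? The best common ancestor is H.
base=$(git merge-base branch1 branch2)

# Is one tip reachable from the other? Here it is not, so neither branch
# is simply "behind" the other.
if git merge-base --is-ancestor branch1 branch2; then
    relation="branch1 is an ancestor of branch2"
else
    relation="branch1 is not an ancestor of branch2"
fi
echo "$relation"
```

In your own repository, git merge-base --is-ancestor main Dev would tell you whether the tip of main is reachable from Dev, which is probably the closest Git gets to "Dev was built on top of main".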
In fact, if we add a new branch name, pointing to commit H
, like this:
          I--J   <-- branch1
         /
...--G--H   <-- main
         \
          K--L   <-- branch2
that just means that commit H
is now on all three branches (and is the last, or tip, commit on main
). If we remove the name main
, commit H
is now only on the remaining two branches.
This is why—and more importantly, when—branch names don't matter. Since we can find H
in two other ways, we can remove the name main
entirely. We'll just find H
by starting at one of the other two branch names and working backwards. Note, though, that we will have "forgotten" that commit H
was supposed to be the last commit on main
, once we no longer have a main
.
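Adding and removing a name is easy to try in a scratch repository. The sketch below uses a shortened version of the drawing (one commit per side instead of two) and made-up identity settings; it only shows that the set of names containing H changes while H itself does not.

```shell
# Shortened drawing: H is shared, J is on branch1, L is on branch2.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
h=$(git rev-parse HEAD)
git branch branch2
git commit -q --allow-empty -m "J"         # branch1 moves past H
git checkout -q branch2
git commit -q --allow-empty -m "L"         # branch2 moves past H

git branch main "$h"                       # add a third name pointing at H
with_main=$(git branch --contains "$h" --format='%(refname:short)')
echo "$with_main"                          # three names contain H

git branch -q -d main                      # remove the name again
without_main=$(git branch --contains "$h" --format='%(refname:short)')
echo "$without_main"                       # two names contain H

kind=$(git cat-file -t "$h")               # H itself is untouched
echo "object $h is still a $kind"
```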
This brings us back to your original question and a key to understanding Git
When you clone an existing Git repository:
$ git clone ssh://git@github.com/user/repo.git
for instance, your computer makes a new Git repository. It then copies all the commits from the other Git repository—which your Git reaches by calling up their Git at the URL you give—so that you have all of their commits ... but it does not copy any of their branch names yet.
Instead, your Git has their Git list out all their branch names, along with those commit hash IDs for the tip commits of their branches. Your Git then takes their branch names and changes them. Your Git turns their branch names into your remote-tracking names,2 such as origin/main
or origin/develop
. It does this by sticking the remote name origin
(and the slash) in front of their branch names.3
The result is that if they have:
          I--J   <-- branch1
         /
...--G--H   <-- main
         \
          K--L   <-- branch2
then you get this instead:
          I--J   <-- origin/branch1
         /
...--G--H   <-- origin/main
         \
          K--L   <-- origin/branch2
Note that at this point, you have no branch names at all.
Usually, the last step of git clone
is to run git checkout
(or in Git 2.23 and later, git switch
). It's this step that creates your main
branch, provided you have your Git check out that name—main
—in the first place.
When you run git checkout main
, your Git:
- checks to see if you have a branch named
main
(no, or at least, not yet);
- if that fails, checks to see if there's a file named
main
(this is a drawback / bad-feature / bug in git checkout
, corrected in git switch
, and is the reason that the Git folks added git switch
in Git 2.23); and finally
- goes to look for an
origin/main
that your Git could use to create a branch named main
.4
This last part works, so your Git creates main
, and sets it up to "track" origin/main
: your own Git's remote-tracking name for some other Git's branch name main
.
This kind of thing happens both on an initial git clone
and on a git checkout
that you run yourself. That gives you:
          I--J   <-- origin/branch1
         /
...--G--H   <-- main, origin/main
         \
          K--L   <-- origin/branch2
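You can watch the renaming happen by cloning a local repository and listing every name in the clone. In this sketch the directory names upstream and clone are arbitrary, and for brevity all three of "their" branch names point to a single commit H rather than to the full drawing.

```shell
# A stand-in for "their" repository, with three branch names.
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream config user.name "Example User"
git -C upstream config user.email "user@example.com"
git -C upstream commit -q --allow-empty -m "H"
git -C upstream branch branch1
git -C upstream branch branch2

# Clone it by local path instead of a URL.
git clone -q upstream clone

# Every name in the clone, with the commit each one points to.
git -C clone for-each-ref --format='%(objectname:short) %(refname)'

# Branch names live under refs/heads, remote-tracking names under refs/remotes.
local_names=$(git -C clone for-each-ref --format='%(refname)' refs/heads)
remote_names=$(git -C clone for-each-ref --format='%(refname)' refs/remotes)
echo "branch names:          $local_names"
echo "remote-tracking names: $remote_names"
```

The only branch name in the clone should be main, created by the final checkout step. The listing will likely also include refs/remotes/origin/HEAD, a symbolic name that this answer does not cover.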
If you now ran git checkout branch1, your Git would notice that there's no actual branch named branch1, and would end up using the "aha, there is an origin/branch1 that I can use to create a (local) branch name branch1 that picks out commit J" trick that is built in to git checkout and git switch.
Git calls this particular trick DWIM, which stands for Do What I Mean: create the branch, as if I ran git checkout -b branch1 --track origin/branch1
. It also now calls it guessing, and git checkout --no-guess
tells it not to try the DWIM code: if you don't have a branch1
, don't search for a remote-tracking name that can be used to create one.
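Here is a small sketch of the guessing behavior, again using a local clone. The repository layout is invented for the example, and the exact error text printed when guessing is off varies by Git version, so it is discarded here.

```shell
# "Their" repository has main and branch1; we clone it.
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream config user.name "Example User"
git -C upstream config user.email "user@example.com"
git -C upstream commit -q --allow-empty -m "H"
git -C upstream branch branch1
git clone -q upstream clone
cd clone

# With guessing turned off there is nothing named branch1 to check out.
if git checkout -q --no-guess branch1 2>/dev/null; then
    no_guess="succeeded"
else
    no_guess="failed"
fi
echo "checkout --no-guess branch1: $no_guess"

# With guessing on (the default), Git creates branch1 from origin/branch1.
git checkout -q branch1
current=$(git symbolic-ref --short HEAD)
upstream_name=$(git rev-parse --abbrev-ref 'branch1@{upstream}')
echo "now on $current, tracking $upstream_name"
```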
2Git documentation calls these remote-tracking branch names. You can use that phrase if you prefer it, and you should expect Git documentation to use it—but many humans accidentally drop the wrong word here and start saying remote-tracking branch, instead of remote-tracking name. If you drop the redundant word branch, you won't make this particular mistake. The fact is that you can't get "on" a remote-tracking name the way you get "on" a branch, so it's important to know that these aren't really branch names.
3You can, if you like, pick a different name than origin
for your remote name. A remote name is mostly a way to store the URL that you had to type earlier, so that later, you can just say origin
instead of ssh://...
or https://...
. But note that the actual remote name you pick, whatever it is, is also the string that Git sticks in front of their branch names. If you choose xyzzy
instead of origin
, you get xyzzy/main
and the like, instead of origin/main
and so on.
4This process is more complicated than I'm showing here. The details depend on your particular Git version, too. But with a fresh clone that made an origin/main
, that's the end result.
Growing a branch by making commits
Note that right after using DWIM mode to create a (local) branch name, the branch name and the remote-tracking name both select the same commit, like this:
...--G--H <-- main, origin/main
Git now, however, says that you are on branch main. If you run git status, it prints out the literal text On branch main. To symbolize that in this kind of drawing, I like to attach the special name HEAD, written in all uppercase, to the branch name:
...--G--H <-- main (HEAD), origin/main
If you now make a new commit (we won't cover the mechanics of doing this; we'll just assume that you know how to do it), Git will make you supply a log message for the new commit, to explain to yourself and others why you made the commit. That commit gets a new, totally-unique hash ID. This hash ID cannot be in use in any other Git repository anywhere, which explains why the hash IDs are so big and ugly, and once it is yours, it cannot ever be used again either.5 But we'll just call it I
here:
...--G--H
         \
          I
Note how new commit I
points backwards to existing commit H
.
It's at this point that Git does its really big magic trick: Git writes the new hash ID for the new commit into the branch name, using the HEAD
attachment to know which branch-name gets updated. So now we have:
...--G--H   <-- origin/main
         \
          I   <-- main (HEAD)
While HEAD
is still attached to main
, the name main
itself is now pointing to new commit I
.
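The same thing can be observed with git rev-parse before and after a commit. This is a sketch against a scratch clone; the directory names and identity settings are assumptions made for the example.

```shell
# "Their" repository with one commit H on main, and our clone of it.
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream config user.name "Example User"
git -C upstream config user.email "user@example.com"
git -C upstream commit -q --allow-empty -m "H"
git clone -q upstream clone
cd clone
git config user.name "Example User"
git config user.email "user@example.com"

before=$(git rev-parse main)               # commit H
git commit -q --allow-empty -m "I"
after=$(git rev-parse main)                # commit I

attached=$(git symbolic-ref HEAD)          # HEAD is still attached to main
remote=$(git rev-parse origin/main)        # the remote-tracking name did not move
ahead=$(git rev-list --count origin/main..main)

echo "HEAD is attached to $attached"
echo "main moved from $before to $after"
echo "origin/main is still $remote; main is $ahead commit(s) ahead"
```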
5Re-using a commit hash ID for a different commit doesn't break anything if the two Git repositories never come in contact with each other. But in practice we never have to worry about it; it's as secure as Git's cryptographic hash function. Well, er, um: How does the newly found SHA-1 collision affect Git?
This is why every Git repository has its own private branch names
As long as your branch names point to the same commits as your remote-tracking names, you don't need a branch name, at least, not yet. You could just git checkout origin/main
. But this results in what Git calls a detached HEAD, which I draw like this:
...--G--H <-- HEAD, origin/main
If you make a new commit in this mode, you get:
...--G--H   <-- origin/main
         \
          I   <-- HEAD
There's no branch name by which to find commit I
. This is OK temporarily but if you now git checkout
something else, we lose the actual hash ID of commit I
. So that's why you want to be "on" a branch before you make a new commit.
(There are ways to recover from mistakes, so it's not fatal to make a commit while in detached-HEAD mode. It's just ... annoying.)
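One such recovery, sketched under the assumption that you still know (or can look up) the hash ID of the stranded commit, is simply to give it a branch name. The name rescued and the rest of the setup are made up for the example.

```shell
# "Their" repository with one commit H on main, and our clone of it.
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream config user.name "Example User"
git -C upstream config user.email "user@example.com"
git -C upstream commit -q --allow-empty -m "H"
git clone -q upstream clone
cd clone
git config user.name "Example User"
git config user.email "user@example.com"

git checkout -q origin/main                # detached HEAD at commit H
git commit -q --allow-empty -m "I, made on a detached HEAD"
lost=$(git rev-parse HEAD)                 # write down I's hash ID

git checkout -q main                       # move away; no name points to I now
found_before=$(git branch --contains "$lost" --format='%(refname:short)')

git branch rescued "$lost"                 # "rescued" is a made-up branch name
found_after=$(git branch --contains "$lost" --format='%(refname:short)')
echo "before: '$found_before'  after: '$found_after'"
```

If you did not save the hash ID, git reflog will usually still list it for a while, which is what makes the mistake annoying rather than fatal.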