Why does the master branch change too, before I `commit` my changes?

Question

Sometimes I make changes on a second branch that I don't want to commit yet.
When I switch back to master, I find that the master branch has changed too. Why?

UPDATE

When I edit a file on the second branch, it seems as if I edited it on master too. The file is not part of any branch until I commit it to some branch. Sometimes this is powerful, and sometimes it confuses me.

Hi. You are supposed to use the second branch until you are satisfied with the changes and then merge it back into master. You can watch or read some tutorials about branches. For example: https://www.atlassian.com/git/tutorials/using-branches — Peter Krebs, Dec 07 '21 at 08:15
as an aside, `git checkout -b master` means create a new branch called master. The `-b` flag means you want a new branch based off the current branch. To switch back to an existing branch omit the `-b`. See here https://www.atlassian.com/git/tutorials/using-branches/git-checkout#:~:text=that%20branch.%20Additionally%2C-,The,will%20create%20the%20new%20branch%20and%20immediately%20switch%20to%20it,-.%20You%20can%20work — gingerbreadboy, Dec 07 '21 at 09:03
Does this answer your question? [git checkout branch after git add, seems not update index and working area](https://stackoverflow.com/questions/67450004/git-checkout-branch-after-git-add-seems-not-update-index-and-working-area) — matt, Dec 07 '21 at 12:23

torek · Answer 1 · 2021-12-07T12:03:53.520

The files you see and work on, in your working tree, are not in any branch.

The way to understand this is to remember the following rules:

Git is about commits.
Git is not about files, although commits hold files.
Git is not about branches, although branch names help us (and Git) find commits.

What Git cares about—what Git stores and transmits to other Git repositories—are the commits.

Each commit:

is numbered: every commit has a big, ugly, random-looking, unique hash ID. The hash ID of some commit is how Git knows that that commit is that commit. Your Git will present this hash ID to some other Git software; if that other Git software, working with its repository, has a commit with this number, it has this commit. If it does not have this number, it needs to get this commit from your Git repository, if your Git is offering it. (The same goes in the other direction, when you have your Git add new commits to your repository, obtained from some other Git repository.)
is read-only: no commit can ever be changed, not even by Git itself.
stores two things: a full snapshot of every file that Git knew about at the time you, or whoever, made the commit; and some metadata, or information about the commit, such as who made it and when. Note that the files stored in a commit are kept in a special, read-only, Git-only format, compressed and de-duplicated. Your computer can't read these files (well, it can read the raw data, but it can't make sense of it) and nothing—not even Git itself—can overwrite these files (because of the same hashing scheme that's used for commits).
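If you want to look at these properties yourself, here is a minimal sketch. It assumes a throwaway repository made with mktemp, and the file name, user name, and email are placeholders I made up for the illustration. It makes one commit and then asks Git to print the commit object: the tree line names the snapshot, and the author and committer lines are the metadata.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

commit_id=$(git rev-parse HEAD)            # the big, ugly hash ID of the commit
obj_type=$(git cat-file -t "$commit_id")   # the object type, here "commit"
echo "$commit_id is a $obj_type"
git cat-file -p "$commit_id"               # tree (snapshot), author, committer, message
```

The hash ID you get will differ from anyone else's, because the metadata (name, email, time stamp) is part of what gets hashed.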

Because commits and their files can only be read by Git itself, Git has to extract a commit before you can do any work with it. This is what git checkout does: it extracts the commit.

When you switch from one branch to another—whether with git checkout or the newer git switch—you may be telling Git to switch from one commit to another commit. In this case, Git has to remove the files that came out of the commit you were using, and replace them with files that came out of the commit you will be using. Before Git does this, though, it checks to make sure that any files it removes-and-replaces aren't actually modified. That way you won't lose work you've done but have not yet committed.

If you have done work, and haven't committed it, the work you have done so far is not in Git. It exists only in the working tree files that Git extracted earlier and that you have changed since. So when Git lets a switch from one branch to another go ahead, your uncommitted modifications simply stay in place, because they aren't in Git and so don't belong to either branch.

This whole system can be pretty confusing, so let's say a bit more.

Commits are numbered, and link to earlier commits

Whenever you, or anyone, make a new commit, the new commit gets a new, unique, random-looking hash ID. That hash ID is—and must be—different from the hash ID of every other commit everywhere in the universe.¹ That new commit gets written out, and from then on, it can never be changed.

The new commit, as it's being written out, can have the hash ID of some older commit stored inside it. This makes the new commit "point to" the older commit. Once we've repeated this trick a few times, we have a chain of commits. If we call them by single uppercase letters—this is easier for humans to understand—we get a drawing that looks like this:

A <-B <-C

where C is the latest commit. We say that commit B, which came just before C, is C's parent. Commit C points to its parent B. Commit C also holds a full snapshot of every file. Commit B, of course, is also a commit and holds a full snapshot of every file and points to its parent A. Commit A is a commit and holds a snapshot, but since commit A is the first commit, it can't point backwards to any earlier commit, so it just doesn't.

By starting with the latest commit and working backwards, Git can find all the commits. So we only need to remember the last commit's hash ID.
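The backwards-pointing chain can be inspected directly. The following is only a sketch, assuming a throwaway repository and made-up commit messages A, B and C standing in for the letters in the drawing. It asks Git for each commit's parent hash ID.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

# Make three commits, standing in for A, B and C.
for name in A B C; do
    echo "$name" > file.txt
    git add file.txt
    git commit -q -m "commit $name"
done

c=$(git rev-parse HEAD)       # latest commit
b=$(git rev-parse HEAD~1)     # one step back
a=$(git rev-parse HEAD~2)     # two steps back

parent_of_c=$(git log -1 --format=%P "$c")   # hash ID stored inside C
parent_of_b=$(git log -1 --format=%P "$b")   # hash ID stored inside B
parent_of_a=$(git log -1 --format=%P "$a")   # empty: A is the first commit

# Walk the chain from the latest commit backwards.
git log --format='%h  parent: %p  %s'
```

Here `%P` and `%p` are the full and abbreviated parent hashes in `git log` format strings. The first commit prints an empty parent, which is how Git knows to stop walking.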

¹This is technically impossible, and someday Git will fail to work. The large size of the hash IDs tries to put that day so far in the future that we don't care that it won't actually work forever: we'll all be long dead before there's a problem. At least, that's the idea, but we're already running into a few other minor issues, so Git is getting a new even-bigger hash ID scheme.

Branch names help us (and Git) find commits

The system above works fine as long as we remember the actual hash ID of commit C. But who can remember some big ugly hexadecimal number like that? I can't, and you probably can't. We could write these down, perhaps ... but hey, wait a minute, we have a computer. Let's have the computer store the number of the latest commit. We'll put it in a small database of names. Let's call them branch names and tag names and the like.

Now that we have names, we can add them to our drawing. Each name points to some commit:

A--B--C   <-- master

Here, the branch name master points to commit C. Let's add another branch name, seconde-branch, that also points to commit C, like this:

A--B--C   <-- master, seconde-branch

We now need a way to remember which name we are using. Let's use the special name HEAD for this:

A--B--C   <-- master (HEAD), seconde-branch

This indicates that we are using commit C as our current commit, via the name master. If we now:

git checkout seconde-branch

we get:

A--B--C   <-- master, seconde-branch (HEAD)

We're still using commit C, but now we're using it via the name seconde-branch.

When we change branches like this, we're not changing which commit we're using. So Git does not have to remove-and-replace any files at all, and therefore, Git doesn't bother. This lets us switch to the other branch, in case we forgot and started editing files too soon.
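This is the situation in the question, and it can be reproduced in a few commands. The sketch below assumes a throwaway repository, and file.txt and its contents are placeholders. Both names point to the same commit, so an uncommitted edit made while on seconde-branch is still there after switching to master.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is named master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "commit C"

git branch seconde-branch                  # a second name for the same commit
git checkout -q seconde-branch
echo "uncommitted edit" >> file.txt        # modify, but do not add or commit

git checkout -q master                     # same commit, so no files are replaced

master_id=$(git rev-parse master)
second_id=$(git rev-parse seconde-branch)
current=$(git symbolic-ref --short HEAD)   # the name HEAD is attached to
last_line=$(tail -n 1 file.txt)            # the edit is still in the working tree

echo "master=$master_id seconde-branch=$second_id on=$current"
git status --short                         # lists file.txt as modified
```

Nothing on master has changed: the name master still holds the same hash ID. The modified file is simply sitting in the working tree, which does not belong to either branch.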

Git's index and your working tree

As I mentioned above, when we first check out or switch to some branch, Git will, if needed, extract all the files from the snapshot in the commit found by the branch name. Inside the commit, these files are in some weird Git-only format, compressed and de-duplicated; once extracted, they're regular everyday files.

These files go in a work area. Git calls this our working tree or work-tree. The files here came out of a commit but are not actually in Git: they're just ordinary files in ordinary folders. Git has no control over these files: you can do anything you want with them.

When you have done something with them, you'd typically like to save the things you did. For this purpose you'll need to make a new commit. In other version control systems, you'd run their commit verb (e.g., hg commit or svn commit) and they'd scan your working tree, find what you changed, and make the new commit. Git, however, is different. Git makes you run git add.

What git add does is copy the updated file back into a secret—well, not really secret, but invisible—Git area that, in effect, sits between a commit and your working tree. This area is extremely important in Git, at least if you ever plan to make any new commits. (If you don't need new commits, you can mostly ignore it.) Because it is so important, and/or because it is badly named, this area has three names: Git calls it the index, the staging area, and—rarely these days—the cache.

(You can—to a limited extent—get by with git commit -a instead of git add. Don't do this! You'll be able to ignore the index for a while, but eventually, Git will whack you over the head with its index. Learn about the index. Embrace it. Some people find it useful: there are clever tricks you can do with it. Some find it annoying, but it's there, in the way, and you need to know about it so you don't trip over it.)

Git's index is a complicated thing, but it plays one pretty constant role, and can therefore be described in one line this way: The index holds your proposed next commit. The initial git checkout or git switch that you run extracts the commit's files to Git's index.

The files in Git's index are in the compressed and de-duplicated form that Git uses internally. The key difference between these files, and the files in a commit, is that the commit cannot be changed, but the index contents can be changed. Running git add tells Git: Make the index copy of this file look like the work-tree copy.

What this means is that after git add, you've updated your proposed next commit. When you first check out commit C, or are on commit C with modified working tree files like this:

A--B--C   <-- master, seconde-branch (HEAD)

the index still holds the original files that were extracted from commit C. Until you run git commit—which will write out the index's files into a permanent form in a new commit—the index copies are just sitting around ready to go into a new commit.

Running git add updates the index copies, making them match the working tree copies. So this means that with, e.g., file.txt, there are three copies:

  HEAD         index      work-tree
---------    ---------    ---------
file.txt     file.txt     file.txt

As you modify the work-tree copy, nothing happens to the other two copies. If we put version numbers in the table above, we get:

   HEAD           index        work-tree
-----------    -----------    -----------
file.txt(1)    file.txt(1)    file.txt(2)

When you run git add file.txt, Git updates the index copy to match the work-tree copy:

   HEAD           index        work-tree
-----------    -----------    -----------
file.txt(1)    file.txt(2)    file.txt(2)

Note that you can change the work-tree copy again, without using git add, and at this point all three copies will differ.
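To see all three copies at once, you can ask Git for each one by name. This is a sketch under the assumption of a throwaway repository, with "version 1", "version 2" and "version 3" as stand-in contents. `git show HEAD:file.txt` reads the committed copy, `git show :file.txt` reads the index copy, and `cat` reads the ordinary file.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "add file.txt"        # HEAD, index and work-tree all hold version 1

echo "version 2" > file.txt
git add file.txt                       # index copy now matches the work-tree copy

echo "version 3" > file.txt            # edit again, without git add

head_copy=$(git show HEAD:file.txt)    # copy in the current commit
index_copy=$(git show :file.txt)       # copy in the index (proposed next commit)
worktree_copy=$(cat file.txt)          # ordinary file in the working tree

echo "HEAD:      $head_copy"
echo "index:     $index_copy"
echo "work-tree: $worktree_copy"
```

If you ran git commit at this point, the new commit would contain version 2, the index copy, and not version 3.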

(Note that if you run git add on a new file, that isn't in the index yet, Git will copy this new file into the index. This adds the new file to the proposed commit. It's not yet in any commit, but it's now ready to be committed. Or, you can run git rm on a file to remove it from both the index and your working tree. Now it's gone from the index, so it won't be in the next commit. This does not affect any existing commits: those cannot be changed.)

When you run git commit, this is what happens:

1. Git gathers any metadata it needs, such as your name and email address, and the current date-and-time. It may collect a log message from you, or use the -m argument to get the log message.
2. Git uses the current commit's hash ID to go into the metadata for the new commit.
3. Git writes out whatever files are in the index.
4. Git turns the above into a commit. This creates the commit's unique hash ID. (One reason for the date-and-time-stamp is that since this is always changing, the hash ID will differ from that of any other commit that is otherwise exactly the same.)
5. This is the sneaky bit. Now that the new commit exists, Git writes the new commit's hash ID into the current branch name.
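The same sequence can be approximated by hand with Git's lower-level "plumbing" commands. Treat this as a rough sketch, not as what git commit literally runs: the real command also handles hooks, the editor, and other details. It assumes a throwaway repository, and the file contents and messages are placeholders.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity, used as commit metadata
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "commit C"            # an ordinary commit to start from

echo "version 2" > file.txt
git add file.txt                       # update the proposed next commit

parent=$(git rev-parse HEAD)           # the current commit will be the new commit's parent
tree=$(git write-tree)                 # write out whatever files are in the index
new=$(git commit-tree -p "$parent" -m "commit D" "$tree")   # build the commit, get its hash ID
branch=$(git symbolic-ref HEAD)        # full name of the current branch
git update-ref "$branch" "$new"        # store the new hash ID in the current branch name

git log --format='%h  parent: %p  %s'
```

Note that the existing commit is not touched at any point. The only thing that changes, besides the new objects being added, is the hash ID stored under the branch name.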

This means that if you currently have:

A--B--C   <-- master, seconde-branch (HEAD)

and you run git commit and it successfully makes a new commit, you now have:

A--B--C   <-- master
       \
        D   <-- seconde-branch (HEAD)

Note how master still points to commit C, but seconde-branch now points to new commit D. Commit D points back to existing commit C as its parent. No commits have changed, but there is now a new commit in the repository.
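You can check this picture against a real repository. The sketch below assumes a throwaway repository with placeholder file contents. It makes commit C on master, creates seconde-branch, commits D there, and then compares the hash IDs stored under each name.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is named master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "commit C"
c=$(git rev-parse HEAD)

git checkout -q -b seconde-branch          # new name, same commit, HEAD attached to it
echo "version 2" > file.txt
git add file.txt
git commit -q -m "commit D"
d=$(git rev-parse HEAD)

master_id=$(git rev-parse master)          # unchanged by the commit
second_id=$(git rev-parse seconde-branch)  # moved to the new commit
parent_of_d=$(git rev-parse "$d^")         # D points back to its parent

git log --oneline --graph --all            # draw both names on the commit graph
```

Only the name that HEAD was attached to moved. That is the whole difference between "being on" master and "being on" seconde-branch when you commit.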

If you now run:

git checkout master     # or git switch master

Git must now remove the commit-D files and replace them with the commit-C files. Git has to do this for all the files that are different. It can cheat a bit, and for any file that is the same in commits C and D, it can leave that file alone in the index and working tree.

(A degenerate case of "leave the file alone" occurs when switching from commit C to commit C: there's no change at all, so all files can be left alone. That's the case you're seeing in your example, and it's always true for the kind of git checkout -b you are using.)

But if some files are different, Git will have to remove-and-replace those. Here Git will first make sure that these files in your working tree aren't changed; if they are, Git will refuse to switch commits. You can force the switch anyway, telling Git to throw away your changes. (Since those changes are in your working tree, which is not in Git, Git will not be able to help you recover from this. So don't ignore Git's complaints about files that would be overwritten. Figure out why you haven't saved them!)
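Here is a small sketch of that safety check in action, assuming a throwaway repository and placeholder file contents. file.txt differs between commits C and D, and it has an uncommitted edit, so the plain checkout is refused and nothing is lost.

```shell
set -e
repo=$(mktemp -d)                          # throwaway directory (an assumption of this sketch)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is named master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "commit C"

git checkout -q -b seconde-branch
echo "version 2" > file.txt
git add file.txt
git commit -q -m "commit D"                # file.txt now differs between C and D

echo "unsaved work" >> file.txt            # uncommitted edit to that same file

# Try to switch; Git should refuse because file.txt would be overwritten.
if git checkout master 2>/dev/null; then
    switched=yes
else
    switched=no
fi

current=$(git symbolic-ref --short HEAD)   # still on the branch we started from
last_line=$(tail -n 1 file.txt)            # the uncommitted edit is untouched
echo "switched=$switched on=$current last line: $last_line"
```

Contrast this with the case where both names point to the same commit: there, no file needs replacing, so Git has nothing to protect and the switch goes through with your edits still in place.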

Let's switch to commit C again:

A--B--C   <-- master (HEAD)
       \
        D   <-- seconde-branch

We can now make a new commit on master, by changing some files, or adding new files, or removing files, or some combination of all three. We git add any updates if necessary and run git commit, and after it succeeds, we have a new commit, with a new big ugly hash ID, but we'll just call it E:

        E   <-- master (HEAD)
       /
A--B--C
       \
        D   <-- seconde-branch

Now, think about this: Which commits are on which branches? In particular, which branch(es) hold commits A-B-C? Remember that Git always starts with the last commits—of which there are now two, commits D and E—and works backwards.

Working backwards from E, we also traverse commits C, then B, then A. So these commits should all be on master. Working backwards from D, we also traverse C, then B, then A. So these commits should all be on seconde-branch.

So: which branch(es) are commits A-B-C on? I'll leave this as an exercise, but will note that Git's answer is very different from, say, Mercurial's.

This question comes up from time to time, perhaps this question and answer here can serve as a duplicate closure target going forward? You put a lot of time and effort into this answer here, it should be possible to use that for future questions as well. — Lasse V. Karlsen, Dec 07 '21 at 09:05
@LasseV.Karlsen: Yes, I tried to keep this one more limited in scope. There's no perfect answer, unfortunately, as different people come to this problem in different ways sometimes. — torek, Dec 07 '21 at 09:06
this is the perfect answer, but my question got a negative point, so why do we spend the time if others won't give a positive point — nextloop, Dec 07 '21 at 14:01
