The git switch keeps the ignored folder

Question

I have a master branch and a test branch. In the test branch I have a raw folder that is listed in the .gitignore file. When I switch between the master and test branches, I can see all the differences in the files as normal. The problem is that the raw folder is kept when I switch from test to master. This folder is meant to be the test-branch's folder and I want it to be removed from the directory when I switch to the master branch. How can I make that happen?

score 3 · Answer 1 · answered Dec 27 '20 at 20:25

You’ve explicitly told Git to ignore the folder, so Git operations won’t affect the folder.

If you want Git to manage the folder, remove the entry from .gitignore and commit the folder on a branch.
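As a minimal sketch of that suggestion, assuming a throwaway repository where master already has a .gitignore listing raw/ (the file raw/data.txt and the commit messages are made up for illustration):

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example"
git config user.email "example@example.com"

# master: one commit whose .gitignore lists raw/ (mirrors the question)
printf 'raw/\n' > .gitignore
git add .gitignore
git commit -q -m "initial commit"

# test: raw/ exists in the working tree but is ignored
git checkout -q -b test
mkdir raw
echo "sample" > raw/data.txt           # hypothetical file

# The fix: drop the raw/ entry, then add and commit the folder's files
grep -v '^raw/$' .gitignore > .gitignore.new || true
mv .gitignore.new .gitignore
git add .gitignore raw
git commit -q -m "track raw/ on test"

tracked_on_test=$(git ls-files raw)
echo "$tracked_on_test"
```

Once the files under raw/ are listed by git ls-files on test, Git manages them like any other committed file.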

grg

torek · Answer 2 · 2020-12-28T00:45:08.863

This folder [named raw] is meant to be the test-branch's folder and I want it to be removed from the directory when I switch to master branch. How can I make [Git do that]?

You can't. More precisely, you can't do that and make it ignored. You can, however, get close: maybe close enough for your purposes. You can simply make the files not ignored.

The reason for this is simple enough, but requires a bit of explanation and attention to detail. (If you want, you can skip straight to the Conclusion section below.)

Git is about commits

You're thinking about Git as if it stored files kept in folders. This simply is not the case. What Git stores is not files but rather commits.

Now, each commit is made up of files, so you might think what's the difference? The difference is pretty big:

The files in a commit are not in folders. They just have names that have slashes in them, such as path/to/file.ext.
You can't work with part of a commit.¹ You always have a whole commit, or else you don't have the commit at all.
The files that are inside a commit are strictly read-only. In fact, everything about every commit is read-only: no part of any existing commit can ever be changed. (Not by you, and not even by Git itself.)
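If you want to see the "names with slashes" for yourself, here is a small sketch in a scratch repository (path/to/file.ext is just the example name from above; everything else is assumed for the demo). It asks Git to list the files of a commit by full name:

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

mkdir -p path/to
echo "hello" > path/to/file.ext
git add path/to/file.ext
git commit -q -m "one file"

# List every file in the commit, recursively, by its full path name
names=$(git ls-tree -r --name-only HEAD)
echo "$names"
```

The listing contains one entry, the full name path/to/file.ext, and no separate entry for the folders path or path/to.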

That last part is perhaps the most critical. The files inside a commit are kept in a special, read-only, Git-only, compressed and de-duplicated format. The de-duplication is important because every commit contains a full snapshot of every file. Most commits mostly re-use the same files as some previous commit, and by de-duplicating the read-only, Git-only files, Git avoids having its internal storage requirements grow too huge.
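One way to observe the de-duplication, sketched in a scratch repository with two made-up files: a file that is unchanged between two commits resolves to the same internal object ID in both, while a changed file gets a new one.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "unchanged" > same.txt
echo "v1" > changing.txt
git add same.txt changing.txt
git commit -q -m "first"

echo "v2" > changing.txt
git add changing.txt
git commit -q -m "second"

# <commit>:<path> names the stored copy of a file inside a commit
same_old=$(git rev-parse HEAD~1:same.txt)
same_new=$(git rev-parse HEAD:same.txt)
chg_old=$(git rev-parse HEAD~1:changing.txt)
chg_new=$(git rev-parse HEAD:changing.txt)
echo "same.txt:     $same_old $same_new"
echo "changing.txt: $chg_old $chg_new"
```

Both commits hold a full snapshot, but same.txt is stored only once.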

But, because nothing but Git itself can read those files, and literally nothing can write them, you can't actually get any work done with the files inside a commit as they appear in that commit. What this means is that Git must extract all the files from some commit, before you can work with that commit.

When you go to make a new commit, Git takes all the files you have told it about—the full snapshot—and uses those to make the new commit. The files in the new commit are then read-only: they are now saved forever, or at least, as long as that new commit keeps existing.

This is what git checkout or git switch is really about, then: it extracts a commit so that you can work on it. The files that you can then see and read and write and otherwise use are not in Git. They are in a work area that is yours to play with as you like. Git calls this work area your working tree or work-tree. (As of Git 2.5, you are not limited to just one working tree. You can create more of them with git worktree add, although this answer won't go into any detail about that.)

¹This is not strictly true: there are some ways to use Git (not the normal everyday ways) that would let you work with part of a commit. This won't help solve your problem, though.

Branch names only exist to let you find commits

Your branch names test and master have only one thing to do with storing files. Remember that, as noted above, Git's storage consists of commits (which then contain files that can be extracted for you to use). Each commit is found by its hash ID. These hash IDs are the big ugly numbers that git log prints before showing each commit. The hash IDs look random (though they aren't actually random), and are unpredictable, and unsuitable for humans. About the only way to get them right is to use the mouse to cut-and-paste them, or similar. So humans don't normally use these hash IDs (except via cut-and-paste as needed). Instead, Git gives us branch names.

What a branch name really does is store one single commit hash ID. This commit is, by definition, the last commit of that branch. We say that the branch name points to the commit. Because holding the raw hash ID of some commit allows Git to find the commit, the branch name lets us find this last commit, which Git calls the tip commit of the branch.
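A quick sketch of this, in a scratch repository with one made-up commit: asking Git what the name master means gives back a single commit hash ID.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example"
git config user.email "example@example.com"

echo "content" > file.txt
git add file.txt
git commit -q -m "only commit"

tip=$(git rev-parse master)            # the hash ID stored under the name
head=$(git rev-parse HEAD)             # the current commit
tip_type=$(git cat-file -t "$tip")     # what kind of object that ID names
echo "master -> $tip ($tip_type)"
```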

(Side question: is the branch just a name? Is it just the last commit? Is it something else entirely? The answer to these questions is "yes", or perhaps "no". The problem here is that the word branch means something different every time we say it. See also What exactly do we mean by "branch"?)

Now, the secret trick here is that each commit also points to some earlier commit(s). A commit isn't just a snapshot of every file. That's the main data of a commit, to be sure; but a commit also contains some metadata, or information about the commit itself: who made it, and when, for instance. One piece of metadata that Git sets up for itself is the hash ID of the previous commit.

This forms commits into backwards-looking chains. From the last commit, we can find the second-to-last commit. From that commit, we can find the third-to-last, and from there we can step back again to the fourth-to-last, and so on. If we draw commits using single uppercase letters to stand in for the big ugly hash IDs, we get a picture that looks like this:

... <-F <-G <-H   <--branch-name

The name, in this case branch-name, holds the hash ID of the last commit, which is how we—or at least Git—will find commit H. Commit H holds the hash ID of earlier commit G, which is how Git will find commit G. Commit G holds the hash ID of earlier commit F, and so on.

What this all means is that each branch name identifies one particular commit, but that's good enough to find every previous commit. But "previous" doesn't exactly mean earlier in time. We call this earlier or previous commit the parent of the commit. Most commits have just one parent, as shown here.
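To make the parent link concrete, here is a sketch that builds a two-commit chain (the messages G and H just mirror the drawing above) and then asks for the parent of the tip:

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "G"
g=$(git rev-parse HEAD)

echo "two" > file.txt
git add file.txt
git commit -q -m "H"

parent_of_h=$(git rev-parse HEAD^)     # HEAD^ means "the parent of HEAD"
count=$(git rev-list --count HEAD)     # commits reachable by walking back
git cat-file -p HEAD                   # raw commit: tree, parent, author, ...
```

The raw commit text printed at the end includes a parent line holding G's hash ID, which is all Git needs to walk backwards.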

Making multiple branches

Suppose we have a chain of backwards-pointing commits (I'm going to get lazy about drawing the arrows between them here) that looks like this:

...--F--G--H   <-- br1 (HEAD)

The HEAD in parentheses here indicates that the branch we have checked out (or git switch-ed to) is br1. That means our current branch is br1, and our current commit is the commit whose hash is H (well, whatever hash ID H really stands in for).

Let's now make a new name, br2, that also points to commit H, like this:

...--F--G--H   <-- br1 (HEAD), br2

Note that at this point, all the commits we are showing here are on both branches.

Now let's make a new commit, while still on branch br1:

             I   <-- br1 (HEAD)
            /
...--F--G--H   <-- br2

Our act of making a new commit told Git to:

package up a new snapshot;
add our name and email address as the person who made this new commit;
gather a log message from us;
use the current date-and-time as "when the commit is made";
use commit H, the current commit at that time, as the parent, so that I points back to H;

and then—this is the key to how branch names work—Git writes I's hash ID, whatever it is, into the current branch name, so that the name br1 now finds commit I.

New commit I is only on branch br1. It is the last commit on that branch. But commit H is the last commit on branch br2. Being the last commit on some branch doesn't mean there cannot be any later commits. It just means that the branch name finds that particular commit as its tip commit. So H is now the tip commit of br2.
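The same sequence can be replayed in a scratch repository. This is only a sketch: the file name and commit messages are invented, and the variable h stands in for whatever hash ID commit H really gets.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/br1   # name the first branch br1
git config user.name "Example"
git config user.email "example@example.com"

echo "h" > file.txt
git add file.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git branch br2                         # new name, same commit H

echo "i" > file.txt
git add file.txt
git commit -q -m "I"                   # made while on br1

br1_tip=$(git rev-parse br1)           # moved forward to I
br2_tip=$(git rev-parse br2)           # still H
parent_of_i=$(git rev-parse br1^)      # I points back to H
echo "br1=$br1_tip br2=$br2_tip"
```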

We can go on to make one more commit on br1, if we want:

             I--J   <-- br1 (HEAD)
            /
...--F--G--H   <-- br2

If we now run git checkout br2 or git switch br2, here's what Git has to do:

Figure out which files exist in br1. That is, figure out the set of files that go with commit J, the branch-tip commit.
Figure out which files exist in br2: that is, the set of files that go with commit H.
For the files that exist in both branch tip commits, some of them might be different. If so, replace the working tree copies of those files.
Some files might only exist in one of the two commits. If so, remove or create the appropriate file in the working tree.
Last, re-attach the special name HEAD to the chosen branch name.

We can draw this as:

             I--J   <-- br1
            /
...--F--G--H   <-- br2 (HEAD)

and we note that the working tree files have changed around, to match the tip commit of br2. That is, our working tree is now that of commit H, not that of commit J.

At this point, we can make a new commit or two as usual. They'll each get new unique hash IDs, which we will call K and L, and this will give us:

             I--J   <-- br1
            /
...--F--G--H
            \
             K--L   <-- br2 (HEAD)

As we switch back and forth between these two commits, Git will:

remove from our working tree, any files that are in this commit but aren't in the other commit;
add to our working tree, any files that are in the other commit but aren't in this commit;
change, in our working tree, any files that are in both commits, but are different in the two commits.

For Git to do this, the files must actually be in one of those two commits. For Git to remove files, the files must actually be absent from the other of these two commits.
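Here is a small sketch of that behavior, assuming a scratch repository in which a made-up file only-on-br1.txt is committed on br1 but not on br2. It uses git checkout; git switch should behave the same way in Git versions that have it.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/br2   # name the first branch br2
git config user.name "Example"
git config user.email "example@example.com"

echo "shared" > common.txt
git add common.txt
git commit -q -m "H"

git checkout -q -b br1
echo "extra" > only-on-br1.txt
git add only-on-br1.txt
git commit -q -m "I"

git checkout -q br2                    # file is not in this commit: removed
on_br2=$( [ -e only-on-br1.txt ] && echo present || echo absent )

git checkout -q br1                    # file is in this commit: created
on_br1=$( [ -e only-on-br1.txt ] && echo present || echo absent )
echo "br2: $on_br2, br1: $on_br1"
```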

Git's index or staging area

Now that you know the big secret of branches—that the word branch is ambiguous, that branch names find tip commits; that commits contain files (but not folders—Git just makes folders in your working tree when it has to do that to keep your OS happy); and that checking out or switching to some commit fills in your working tree from that commit—we're ready to take a closer look at the process of making a new commit.

Above, we just blithely assumed you know how to make new commits. Of course, you do know how to do that. But there is another big secret here. Well, it's not really a secret, it's just that a lot of Git tutorials don't present it properly, and/or people miss it because there's so much stuff to learn all at once.

This secret, if it is really a secret, is that Git doesn't make new commits from the files in your working tree. Instead, Git makes new commits—the snapshot part, at least—from the files that are in Git's index.

This name, "index", is not a very good name. It's kind of meaningless. So this thing has two more names. All three names refer to the same thing. The other main name is staging area, and this refers to how Git actually uses the index. Git makes you "stage" a file for committing, by running git add. You no doubt already know this—but the secret part is that git add doesn't just copy a file into Git's index. Most of the time, that file was already in Git's index before. What git add does is replace the file.

More precisely, git add tells Git: Make the index copy of this file match my work-tree copy. Git will in fact remove the index copy of the file, if you've removed your work-tree copy, because that's how to make it match. If the file is not in Git's index at all at this point, Git will copy it into its index. Either way, once git add is done, the file in Git's index now matches the one (or the lack of one) in your working tree. But the file may well already have been in Git's index.
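A sketch of "make the index match", with two invented files: one is edited and one is deleted in the working tree, and git add is then run on both names. This assumes Git 2.0 or later, where git add with a path also records removals.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > a.txt
echo "keep" > b.txt
git add a.txt b.txt
git commit -q -m "first"

echo "v2" > a.txt                      # change one file
rm b.txt                               # delete the other
git add a.txt b.txt                    # make the index match both facts

index_list=$(git ls-files)             # names currently in the index
index_a=$(git show :a.txt)             # ":path" reads the index copy
echo "index now holds: $index_list (a.txt = $index_a)"
```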

The files in the index are in the special, read-only, Git-only, de-duplicated format. The difference between the index copy and the committed copy is simply that the index copy can be replaced (by removing it wholesale, then inserting a different file: we never actually overwrite the index file since it's probably shared with some earlier commit).² So, in effect, what's in Git's index is your proposed next commit.

To summarize this section, then, Git's index:

has three names: the index, the staging area, and the cache (the last one mostly shows up in command line flags these days, e.g., git rm --cached);
holds your proposed next commit;
holds a third copy of each file.

You already know there's a frozen committed copy of each file, which Git used to create your working tree copy, so that's two copies. The index holds a third copy. It starts out matching the committed copy, and keeps matching it until and unless you use git add or git rm or similar to change or remove it.

When you run git commit, Git makes the new snapshot from whatever is in Git's index at this time. That's why you have to git add files over and over again. You're not just telling Git that you changed the file, you're actually storing the updated file into Git's index, ready to go into the next commit. (This makes git commit go incredibly fast, if you're used to older, pre-Git version control systems. You could run their commit commands and go out for lunch sometimes.)
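You can check that the commit comes from the index, not the working tree, with a sketch like this one (the file name and contents are made up): the file is staged, then edited again without a second git add, then committed.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "committed" > f.txt
git add f.txt
git commit -q -m "first"

echo "staged" > f.txt
git add f.txt                          # index copy is now "staged"
echo "work-tree only" > f.txt          # edited again, but not re-added
git commit -q -m "second"

in_commit=$(git show HEAD:f.txt)       # what the new snapshot contains
in_worktree=$(cat f.txt)               # what is on disk
echo "commit: $in_commit / working tree: $in_worktree"
```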

²Technically, what's in the index is:

the file's name (including any embedded slashes);
the file's mode (executable or non-executable);
an internal blob hash ID; and
a bunch of cache data and other internal but temporary data that don't go into the commits themselves.

The git add process compresses the file down to a pre-de-duplicated blob, or re-uses an existing blob if there is one with the right content. It then updates the index entry. There's more that happens when the index has been expanded for a conflicted merge, but this answer does not cover any of that.
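For the curious, git ls-files --stage prints the first three of these items (plus a stage number that matters only during conflicted merges). A sketch with one made-up file, which also checks that the listed blob hash ID is the one Git computes for the file's content:

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

echo "hello" > file.txt
git add file.txt

# Output format: <mode> <blob hash ID> <stage number><TAB><name>
entry=$(git ls-files --stage file.txt)
echo "$entry"

mode=$(printf '%s\n' "$entry" | awk '{print $1}')
blob=$(printf '%s\n' "$entry" | awk '{print $2}')
expected=$(git hash-object file.txt)   # hash ID Git computes for the content
```

The mode check below assumes a platform where a newly created file is non-executable, which Git records as 100644.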

Tracked, untracked, and ignored files, and the `git status` command

Once you properly understand Git's index / staging-area / cache, then—and only then—do Git's rules about untracked and ignored files make any sense. The reason for this is simple and has to do with the definition of an untracked file, which is: An untracked file is a file that exists in your working tree right now, that is not in Git's index right now. That's it: that is all there is to being untracked.

A tracked file, of course, is one that's in Git's index. Generally these files probably should also be in your working tree. Git doesn't actually need a copy to exist in your working tree, since the next commit will just use the index / staging-area copy, but it's weird to have the file in Git's index and not have it in your working tree.

An ignored file is one that not only is untracked but also is listed in your .gitignore file or similar. Entries in ignore-files are used to suppress complaints about untracked files.
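The difference shows up in git status output. In this sketch (file names are invented), scratch.txt is merely untracked while build.log is untracked and ignored; the --porcelain format marks untracked files with ?? and, when --ignored is given, ignored files with !!.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'build.log\n' > .gitignore
git add .gitignore
git commit -q -m "add ignore rules"

echo "notes" > scratch.txt             # untracked
echo "output" > build.log              # untracked and ignored

untracked=$(git status --porcelain)
with_ignored=$(git status --porcelain --ignored)
ignored_line=$(printf '%s\n' "$with_ignored" | grep '^!!')
echo "$with_ignored"
```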

When you run git status, Git:

1. prints out the current branch name and other helpful stuff;
2. compares the current commit—the one found by HEAD, as we saw above—to the files in Git's index;
3. compares the index to your working tree.

Anything that Git finds, in step 2, that's different, Git calls staged for commit. Since the index is the proposed next commit, whatever files are in it that are the same as the current commit aren't very interesting, and whatever files are in it that are different are interesting. So these are the names that Git prints.

Anything that Git finds in step 3 that's different is also important. These are files you could git add. If you did, that would change the proposed next commit. So Git prints these out as unstaged files. (A tracked file that's missing from your working tree will show up as an unstaged delete, here.)
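The two comparisons can be run by hand, which may help make the staged / unstaged split less mysterious. In this sketch (file names invented), git diff --cached compares the HEAD commit to the index, and plain git diff compares the index to the working tree:

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "1" > staged.txt
echo "1" > unstaged.txt
git add staged.txt unstaged.txt
git commit -q -m "first"

echo "2" > staged.txt
git add staged.txt                     # index now differs from HEAD
echo "2" > unstaged.txt                # working tree differs from index

staged=$(git diff --cached --name-only)    # HEAD vs index
unstaged=$(git diff --name-only)           # index vs working tree
echo "staged: $staged / not staged: $unstaged"
```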

But—here's the oddity—instead of saying that there are unstaged files that could be added that aren't in Git's index at all, and calling those unstaged added files, Git calls them untracked files. So if you have a bunch of files you don't want to commit, git status will whine about them, reminding you that it's time to add them to the index / staging-area.

This whining is annoying. To make git status shut up, we list the file names in an ignore file. Then git status sees them as untracked, but simply does not complain. But note that if the files are in the index—will be in the proposed next commit—they aren't untracked and git status would not complain here, so listing tracked files in .gitignore has no effect.
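A sketch of that last point, using a made-up settings.txt: the file is committed first and only afterwards listed in .gitignore, and Git keeps tracking it and reporting changes to it.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > settings.txt
git add settings.txt
git commit -q -m "track settings.txt"

printf 'settings.txt\n' > .gitignore   # listed only after it was tracked
git add .gitignore
git commit -q -m "try to ignore it"

echo "v2" > settings.txt
modified=$(git diff --name-only)       # still reported as changed
tracked=$(git ls-files settings.txt)   # still in the index
echo "modified: $modified"
```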

In other words, tracked files can't be ignored. But remember, a tracked file is one that's in Git's index right now. How did it get there? Can you remove it?

The answer to how did it get there is probably: because it was in a commit. The other answer could be: because I added it.

The answer to can you remove it is yes, of course. All you have to do is run git rm. This removes both the index copy of the file, and your working tree copy of the file. Of course, if you take it out of Git's index, it won't be in the next commit. If it is in the current commit, and isn't in a new commit, the difference between those two commits includes the removal of that file.
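Sketched with a made-up file old.txt: after git rm the file is gone from both the index and the working tree, and the next commit differs from the previous one by that deletion.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "data" > old.txt
git add old.txt
git commit -q -m "add old.txt"

git rm -q old.txt                      # removes index and work-tree copies
in_index=$(git ls-files old.txt)
in_worktree=$( [ -e old.txt ] && echo present || echo absent )

git commit -q -m "remove old.txt"
# Names of files deleted between the previous commit and the new one
removed=$(git diff --diff-filter=D --name-only HEAD~1 HEAD)
echo "removed in new commit: $removed"
```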

There are two more features of .gitignore entries though:

listing files here means that if they're currently untracked-and-ignored, an en-masse git add, such as git add ., won't add them; and
listing a file here gives some Git operations permission to destroy the working tree copy of the file in some somewhat-unusual corner cases.

The latter is a good reason not to list a file in .gitignore if there's no reason to do so.
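The first of these two features is easy to try. In this sketch (file names are hypothetical), git add . picks up everything except the ignored file:

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

printf 'secret.env\n' > .gitignore
echo "token" > secret.env              # untracked and ignored
echo "code" > app.txt                  # untracked, not ignored

git add .                              # en-masse add

app_staged=$(git ls-files app.txt)
secret_staged=$(git ls-files secret.env)
git ls-files                           # prints the names now in the index
```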

Conclusion

This folder [named raw] is meant to be the test-branch's folder and I want it to be removed from the directory when I switch to master branch. How can I make [Git do that]?

Suppose you check out test, so that the tip commit of test is the current commit, and the files named raw/*—that your OS insists go into a folder named raw—are contained in that commit and are therefore currently tracked, i.e., in Git's index. Suppose that the tip commit of master lacks those files. Then git checkout master will remove those (tracked, thus not ignored) files.³

You could list raw/ in a .gitignore but this would be entirely unhelpful: the files should already be absent when you have the tip commit of master checked out, and should come back when you have the tip commit of test checked out. Those files are in various commits that are find-able from branch name test but not from branch name master; those files are absent in the commits find-able from branch name master. (Any shared commits, on both branches, should lack the raw/ files.)
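Here is a minimal sketch of that arrangement, assuming a scratch repository where a made-up raw/data.txt is committed only on test and raw/ is not listed in any .gitignore. It uses git checkout; git switch should give the same result.

```shell
set -eu
repo=$(mktemp -d)                      # throwaway repository for the demo
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example"
git config user.email "example@example.com"

echo "readme" > README.txt
git add README.txt
git commit -q -m "shared commit without raw/"

git checkout -q -b test
mkdir raw
echo "sample" > raw/data.txt
git add raw
git commit -q -m "raw/ exists only on test"

git checkout -q master                 # raw/data.txt is not in this commit
on_master=$( [ -d raw ] && echo present || echo absent )

git checkout -q test                   # raw/data.txt is in this commit
on_test=$( [ -f raw/data.txt ] && echo present || echo absent )
echo "master: raw is $on_master / test: raw is $on_test"
```

Because the files are tracked on test and absent from master, the folder disappears and reappears as you switch.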

Note that if you ever merge some of these commits, you may need to merge with --no-commit and fix up the merge result. Merging—depending on the kind of merge, anyway—also makes commits "more findable", so the direction of the merge may become very important. We'll leave this for other questions-and-answers, though.

³If Git removes all the files within some folder in your working tree, Git will generally remove the folder as well. Note, though, that since Git doesn't actually store folders, it does not matter whether you have an empty folder in your working tree. In fact, since Git builds commits from the files in Git's index, it only matters which files (complete with embedded slashes in their names) are in Git's index.

You don't normally look at what's in Git's index, though if you want to, try running git ls-files --stage. Since this lists every file that will be in the next commit, its output is usually large and not useful. The git status command compares the index to both the HEAD commit, to print the "staged for commit" list, and to your work-tree, to print the "not staged for commit" and "untracked" lists, and this is generally more useful than looking directly at the index.

score 0 · Answer 3 · answered Dec 27 '20 at 17:24

It seems that either you have committed it to the master branch, or you have not committed it at all and it exists only as local changes.

In the first scenario, check out the master branch, delete the raw folder, and commit. Now switch between the branches. You should see the raw folder in the test branch but not in the master branch.

In the second scenario, check out test and commit your changes, adding the raw folder and .gitignore.

Another possible scenario is that you added the raw folder to the master branch before you updated the .gitignore. Check out master, delete the raw folder, and commit. Switch back to your test branch, add the raw folder, update the .gitignore, and commit.

I committed all changes and the `test` branch is clean. When I switch to `master`, `git status` tells me about an `untracked raw folder`. — Paweł, Dec 27 '20 at 17:29
https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository — Bro3Simon, Dec 27 '20 at 19:32