If I understand correctly, git stash -u stashes everything in your working directory, including untracked files, and puts your working directory in the state it was in after the last commit, i.e., the position of HEAD.

But when I ran it, it got rid of all untracked files except one folder (also removed all the tracked and modified files from my working directory of course). That particular folder is not in any of my .gitignore before or after stashing. It was showing as untracked before I ran the command and after. Is this a bug or is there a reason why this happened?

For further details, I had copied the folder from somewhere else in the same repo but I had done the same with several other folders which got stashed and removed. My goal was to stash all my local changes both tracked and untracked so that I could get a working directory in exactly the same state as the latest commit.

Skawang
  • Do you still need the changed files, so that you can restore them from the `stash` later? If your only goal is a working directory that looks the same as the last commit, and you don't need the changes made since then, run `git clean -fd` (`-f` to force, `-d` to include directories). ATTENTION: those files will be gone **forever**! – SwissCodeMen Feb 07 '21 at 08:13
  • I might need those files later, that's why I decided to go for `git stash -u`. At the same time I don't think they're in a state where I can make a commit. My confusion basically is why did `git stash -u` leave an untracked folder in my working tree? – Skawang Feb 07 '21 at 10:52
  • It's difficult to help without further information. You say the *folder* that is **not stashed** is not in any `.gitignore` in this repository. Have you tried `git stash -u --all`, which also stashes files that are ignored via a `.gitignore`? Maybe the folder is ignored after all. – SwissCodeMen Feb 07 '21 at 12:30

1 Answer

TL;DR

But when I ran it, it got rid of all untracked files except one folder (also removed all the tracked and modified files from my working directory of course). That particular folder is not in any of my .gitignore before or after stashing. It was showing as untracked before I ran the command and after. Is this a bug or is there a reason why this happened?

This depends on what is in this folder. Git never stores folders in the first place, so if the folder is empty, that's one reason it would remain; but Git never bothers to mention empty folders either, in git status, so presumably you mean a not-empty folder.

The whole thing is a little bit of a mystery. Perhaps you can solve it.

Long

It is, I think, helpful to start with what git stash does when used without the -u or -a options (which both have longer spellings: --include-untracked and --all respectively). What this kind of git stash does is:

  1. commit whatever is in Git's index: this is the same commit that git commit would make, except that git stash makes this commit on no branch, rather than on the current branch; then
  2. commit whatever files are in Git's index after using git add on all of them, i.e., the equivalent of git add -u && git commit: this is similar to the commit you'd get if you did just that, except that git stash also makes this commit on no branch, rather than on the current branch, and for internal reasons, makes the rest of Git think of this commit as a merge commit.1 Then, git stash will
  3. run git reset --hard, to reset both Git's index and your working tree.

This last git reset --hard erases all staged changes and all unstaged changes.2 That's OK to do because everything you need to restore those changes, staged and/or unstaged, is in the commits that git stash made in steps 1 and 2.
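To see those two commits for yourself, here is a small sketch in a throwaway repository. The file names and the example identity are made up for illustration; the only things taken from Git itself are that the stash ref names the w commit and that its second parent (stash^2) is the i commit.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo one > staged.txt
echo one > unstaged.txt
git add . && git commit -q -m "base"

echo two > staged.txt && git add staged.txt   # a staged change
echo two > unstaged.txt                       # an unstaged change
git stash -q                                  # plain stash: no -u, no -a

# The stash ref names the w commit; count its parents.
parents=$(git rev-list --parents -n 1 stash | wc -w)
parents=$((parents - 1))                      # the first word is the commit itself
echo "w commit has $parents parents"

i_staged=$(git show stash^2:staged.txt)       # index snapshot: staged edit only
i_unstaged=$(git show stash^2:unstaged.txt)   # index snapshot: still the old text
w_unstaged=$(git show stash:unstaged.txt)     # working-tree snapshot: new text
after_reset=$(cat unstaged.txt)               # working tree is back at HEAD
echo "i: $i_staged/$i_unstaged  w: $w_unstaged  now: $after_reset"
```

The two-parent count is what makes the rest of Git treat the w commit as a merge.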

To really understand this, though, you need a good grasp of what Git's index is all about, so that points 1 and 2 make sense. Let's take a moment to define this, since so many Git tutorials are bad about it. (If you already know exactly what the index is and does for you, feel free to skip the next section.)


1If all of Git considers this commit to be a merge commit, then perhaps it is a merge commit. But it wasn't made using any of Git's merge machinery. So is it a merge commit, or not? Does it matter? If so, when does it matter? These are all useful questions to ponder, but only after you have a lot of experience with the rest of Git.

2Git doesn't really deal in changes, but rather in snapshots. The difference is simple enough: given two snapshots, we can find changes, but given a change, we can't find any snapshots. Think of it like the weather forecast. If they tell you it will be ten degrees warmer or colder today than it was yesterday, what temperature was it yesterday and what temperature will it be today? Can you even answer this question? But if they tell you it was 20 yesterday and will be 30 today, what temperature was it yesterday, what temperature is it today, and how different are those two temperatures?

Two snapshots will get you a difference. A difference won't get you any snapshots. That's why Git deals with snapshots. But as humans, we're often more interested in the differences. That's why Git tells you the differences.


The index aka staging area aka cache

Git's index is a very big deal in Git, and you really do have to know about it. There are ways to try to avoid learning about it—such as using git commit -a—but skipping over this is doing yourself a disservice, if you're going to use Git. The problem here is that Git, being a Monty Python fan, will come out now and then and slap you in the face with a fish made of its index, at least metaphorically; you need to be able to retaliate.

The index, in Git, has numerous roles, and it gets expanded during conflicted merges (after which you literally can't commit). We won't go into those details here, though I'll mention that the process of resolving the conflict is essentially the process of whittling the index back down to size (after which you can commit again). Instead, we'll concentrate on how Git uses the index to "track" files, and how this means that what's in Git's index is what goes into a commit.

The index also has three names. This reflects its important role, or the fact that "index" is a terrible name, or both. The other two names are the staging area, which refers to how you use it, and the cache. The last name is not used very often today: it mostly shows up in flags, like git rm --cached. But what is the index?

The best way to explain the index, I think, is by example. If you run git ls-files --stage, Git will dump out the entire contents of the index as it exists right now in whatever Git repository you're working in. Here's a snippet out of the index in a Git repository for Git:

$ git ls-files --stage | sed -n -e 10,19p
100644 cbeebdab7a5e2c6afec338c3534930f569c90f63 0       .gitmodules
100644 bde7aba756ea74c3af562874ab5c81a829e43c83 0       .mailmap
100644 05f3e3f8d79117c1d32bf5e433d0fd49de93125c 0       .travis.yml
100644 5ba86d68459e61f87dae1332c7f2402860b4280c 0       .tsan-suppressions
100644 fc4645d5c08bd005238fc72cfa709495d8722e6a 0       CODE_OF_CONDUCT.md
100644 536e55524db72bd2acf175208aef4f3dfc148d42 0       COPYING
100644 ddb030137d54ef3fb0ee01d973ec5cee4bb2b2b3 0       Documentation/.gitattributes
100644 9022d4835545cbf40c9537efa8ca9a7678e42673 0       Documentation/.gitignore
100644 45465bc0c98f5d88cfe1ade092d29b5dc32c1e23 0       Documentation/CodingGuidelines
100644 b9804070594d9cd33dfc1e30cdd925c6e83a2187 0       Documentation/Makefile

If I run git status right now, I get nothing to commit, working tree clean. Yet the index is full of files. What the index contains—as shown above—is, for each file, four things: a mode string, usually 100644, a hash ID, a staging number—this needs to be zero for you to be able to commit, and except for mid-merge, always is zero—and a file name, such as COPYING or Documentation/Makefile.

These are the files that will go into the next commit you make. That's really it: it's that simple. The mode, which is 100644 for a regular file and 100755 for an executable file, is how Git will restore the file mode later, on a system with executable files.3 The file name—complete with embedded slashes; there are no such things as "folders" in the index—tells Git what to name the file, and the hash ID represents the de-duplicated file contents as they'll appear in the commit's snapshot-of-all-files.

When you first check out some commit, Git fills in its index with all the files in the snapshot that is in that commit.4 Then Git fills in your working tree with the same files. So now you have, in both Git's index and your working tree, all the files from that commit.

When a file is in Git's index, Git calls that file tracked.5 If you have a file in your working tree that isn't in Git's index, that's an untracked file. Your working tree is an ordinary directory (or folder, if you prefer that term) on your computer. In a very real sense your working tree isn't in the Git repository at all.6 So you can create and destroy files and folders at will here. Git has no knowledge of this, and no influence over it. Later, you might run git add, for instance, to tell Git look at my working tree now. Or you might not!

When you run git commit, Git uses the files that are in Git's index (not the files in your working tree!) to make the new commit. So if you git checkout some commit, or use git switch to check out some commit, that fills in Git's index with that commit, and fills in your working tree too. If you then modify stuff in your working tree, but don't run git add, and try to git commit, Git tells you that there's nothing new to commit. What Git just did was to compare the files in Git's index to the files in the current commit. They were all the same! You may have changed your working tree, but nothing happened in the repository. So there was nothing new to commit.

When you run git add, you are telling Git: Look at my working tree files now. In particular, you tell Git to make, in its index, its copy of each file that's ready to be committed match the one you've updated—or maybe created from scratch—in your working tree. If you made an all-new file, Git copies that all-new file into its index. If you updated some existing file, Git copies the updated contents into its index. Either way, you've now made Git's index copy match your working-tree copy.
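One way to watch this happen is to print the hash ID stored in the index before and after git add. This is only a sketch in a scratch repository; docs/readme.txt and the helper index_hash are names I made up for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

mkdir docs
echo v1 > docs/readme.txt
git add docs/readme.txt && git commit -q -m "v1"

# Hypothetical helper: print the hash ID the index holds for the file.
index_hash() { git ls-files --stage docs/readme.txt | awk '{print $2}'; }

before=$(index_hash)
echo v2 > docs/readme.txt        # edit the working-tree copy only
after_edit=$(index_hash)         # the index copy has not moved
git add docs/readme.txt          # copy the new contents into the index
after_add=$(index_hash)
worktree_hash=$(git hash-object docs/readme.txt)

echo "before:     $before"
echo "after edit: $after_edit"
echo "after add:  $after_add"
```

Editing the file leaves the index entry alone; only git add makes the index copy match the working-tree copy.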

In the end, then, the index functions as your proposed next commit. You stage a file by copying it, up (or down) onto (into) the stage (staging area): arranging it for a new snapshot, as if you were moving furniture and dummies around on a stage, making photographs to advertise a play, perhaps. You can stage something, then stand back and look at the photo you'd get. If you like it, you actually take the photograph now, by running git commit. If you don't like it, you fiddle with the working tree copies of files, run git add again, and stand back and look again.

The key points here are these:

  • The index acts as your proposed next commit. There are copies of files in the index all the time. They initially come out of some commit. They get updated when you run git add. The git commit command snapshots whatever's "on stage". The git status command lists files as staged for commit when the copy that's in the index is different (as compared to the copy in the current commit).

  • Files can exist in your working tree without ever going "on stage". These untracked files won't be in a future commit. It's likely that they don't exist in the repository at all.7

  • The index only holds file names, not directory names. This is why you can't commit an empty directory: Git builds commits from the index, and the index holds only files. The names can have slashes in them—Documentation/Makefile for instance—and the host system may require that this exist in your working tree as a directory (folder) holding a file named Makefile, but in the index, it's just a file named Documentation/Makefile.

Since git commit is only going to commit the files that are listed in the index, any stuff you do with untracked files is irrelevant. And, since git commit is going to commit what's in the index, any stuff you do with tracked files only matters if you git add those files.8
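A short sketch of that last point, using made-up file names in a scratch repository: the commit gets the copy that was staged, not the later working-tree edit, and the untracked file never enters the commit at all.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo draft > notes.txt
git add notes.txt && git commit -q -m "draft"

echo staged > notes.txt && git add notes.txt   # this version goes "on stage"
echo "later edit" > notes.txt                  # changed again, but not added
echo scratch > untracked.txt                   # never added at all
git commit -q -m "second"

committed=$(git show HEAD:notes.txt)           # what the snapshot holds
worktree=$(cat notes.txt)                      # what is on disk
untracked_in_commit=$(git ls-tree -r --name-only HEAD | grep -c untracked.txt || true)
echo "committed: $committed / working tree: $worktree"
```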


3On systems that don't support this stuff, Git still keeps the mode around in the index, so that it can go into the commits. Git will notice that the system doesn't support file modes and will set core.fileMode to false; to change modes in Git's index, you will then have to resort to Git's low-level update-index command. If you're on Linux or macOS this is normally not a problem at all. Samba can emulate executable permissions to some extent even on Windows systems, too, although since I avoid Windows I'm not sure how well this works in practice.

4I'm deliberately glossing over a lot of fine details here: Git has the ability to not fill in the index for certain files in certain cases, so as to let you switch commits while you have uncommitted work. This idea—of switching commits while carrying uncommitted work around—is tricky and full of corner cases. For all the gory details, see Checkout another branch when there are uncommitted changes on the current branch.

5Technically, Git only defines the term untracked file, but if a file that isn't in the index is untracked, it stands to reason that a file that is in the index must be tracked.

6The repository itself is—usually, at least—in a .git subdirectory of the top level of your working tree. The stuff inside that .git folder—all the files and directories therein—is the repository, and is under Git's control; the stuff outside of it isn't under Git's control, and therefore is not part of the repository.

7Technically, an untracked file just doesn't exist in the proposed next commit. It could exist in some other, existing commits. It's pretty common for untracked files to not be in any commits, though.

8Note that git commit -a makes Git automatically run a git add step during the git commit step, but this add only updates the files that are already in the index, so you're still stuck with doing a git add on any new files. While git commit -a is pretty attractive at first (it makes using Git as easy as using Mercurial, for instance), it will eventually let you down, if you try to use it to avoid learning about the index. We'll see that in the very next section, about git stash -u.


The -u and -a options

Adding either -u or -a to git stash—more precisely, to a save or push sub-command—tells Git to add, to the two commits on no branch that normally make up a stash, a third commit. This third commit is not properly described in the documentation. The documentation has a brief section titled DISCUSSION that talks about saving the index and working tree files in I and W commits (which I normally render in lowercase in my own descriptions) but it never mentions the U commit that is added for these kinds of stash operations.

We need to start with the First Rule of Commits: Git's commits are always made from an index.9 Note that this says an index, rather than the index: we can point the internal bits of Git to a temporary index, which we make however we like. For instance, we might copy the index—the main / real one—to a new temporary index, then fuss with the temporary index. That's how the old git stash program worked, when git stash was still a shell script. It's been rewritten as a C program, which still does this, it's just that now it's harder to see how it does this.
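To illustrate the "an index" idea, here is a rough sketch that makes a commit on no branch from a temporary index, using the GIT_INDEX_FILE environment variable plus git write-tree and git commit-tree. This is not a copy of what git stash runs internally; the temporary index path and file names are my own inventions.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo tracked > tracked.txt
git add tracked.txt && git commit -q -m "base"
echo extra > extra.txt                        # an untracked file

tmp_index="$repo/.git/tmp-index-example"      # hypothetical file name
GIT_INDEX_FILE="$tmp_index" git add extra.txt # add to the temporary index only
tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
commit=$(git commit-tree "$tree" -m "snapshot from a temporary index" </dev/null)
rm -f "$tmp_index"

files_in_commit=$(git ls-tree -r --name-only "$commit")
main_index=$(git ls-files)                    # the real index is untouched
echo "commit holds: $files_in_commit"
echo "main index holds: $main_index"
```

The new commit has no parent and no branch name pointing at it, and the main index never learned about extra.txt.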

This means that each of the two or three commits that hold a stash—i, w, and u if we make u—are made from an index, and therefore can only hold files. They cannot hold a directory. The files they hold can have slashes in their names, but like all commits, there are no directories, just files with slashes in their names.

When using -u or -a, git stash will still make the i and w commits as usual, and will still use git reset --hard as usual. But before it finishes up, git stash will enumerate some or all untracked files. This is also where the difference between -u and -a comes in.

We can group untracked files into two groups. There are untracked but not ignored files, where git status will whine about these files; and there are untracked and ignored files, where git status will say nothing.10 The -u option makes a list of the whined-about untracked files, while the -a option makes a list of all untracked files, even if the whining is to be suppressed.
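You can list the two groups yourself. The sketch below uses git status --porcelain --ignored and git ls-files --others; treat the two ls-files invocations as an approximation of the lists that -u and -a work from, not as the exact internal code. The file names are made up.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo '*.log' > .gitignore
git add .gitignore && git commit -q -m "ignore logs"

echo data > notes.txt      # untracked, not ignored: status whines
echo noise > build.log     # untracked and ignored: status stays quiet

status=$(git status --porcelain --ignored)   # "??" = untracked, "!!" = ignored
echo "$status"

u_list=$(git ls-files --others --exclude-standard)   # roughly the -u list
a_list=$(git ls-files --others | sort)               # roughly the -a list
echo "-u would consider: $u_list"
echo "-a would consider:" $a_list
```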

Having made this list of (some or all) untracked files, git stash push -u or -a now creates this third u commit. It does so by:

  1. creating an empty temporary index;
  2. git add-ing to the temporary index, all the files listed;
  3. git commit-ing this as the special u commit; and
  4. removing these files from your working tree.

That's it. That's really all that's involved.11 Now that the triple-commit stash exists, after the git stash -u finishes, your untracked files are gone from your working tree. But there's one gap left.
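Here is a sketch of the normal, working case in a scratch repository, so you have something to compare your own results against. The names copied/a.txt and build.log are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo '*.log' > .gitignore
echo base > tracked.txt
git add . && git commit -q -m "base"

mkdir copied
echo a > copied/a.txt      # untracked, not ignored
echo noise > build.log     # untracked and ignored
git stash -q -u

parents=$(git rev-list --parents -n 1 stash | wc -w)
parents=$((parents - 1))                           # first word is the commit itself
u_files=$(git ls-tree -r --name-only stash^3)      # contents of the u commit
left_untracked=$(git ls-files --others --exclude-standard)
echo "stash has $parents parents; u commit holds: $u_files"
```

With -u the ignored build.log stays where it is, while copied/a.txt ends up in the u commit and disappears from the working tree.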


9It's possible to make one without using an index, using git mktree, but that's not how it's done in practice. Instead, internally, commands use git write-tree, which turns the—or at least an—index into a snapshot, and then use git commit-tree to wrap the snapshot up into a commit.

10Note that there is no such thing as a tracked-yet-ignored file: a tracked file is, by definition, not ignored. This means .gitignore is in a way the wrong name: it should perhaps be .git-do-not-whine-about-these-untracked-files. That's not sufficiently descriptive, but we'll leave that for other answers. This name is too long already (and the sufficiently descriptive file name is utterly ridiculous), so that's why Git calls this file .gitignore.

11The code to make this happen was so convoluted, originally, that it was wrong when doing git stash push -u on partial trees, when the partial-tree stash feature was added. This would irrecoverably remove files that weren't put into the u commit. That's long since fixed, but it's one of many, many reasons I recommend avoiding git stash as much as possible.


How Git cleans up, and the mystery here

When you run git reset --hard or git clean—both use some of the same code internally—Git will sometimes remove some files from your working tree. And, as we noted earlier, Git never stores folders in the first place. Instead, Git has, in commits,12 files with slashes in their names, like Documentation/Makefile.

What Git does with this is that when it needs to create a file in your working tree, it tries to do so, and if that fails because the OS rejects the create new file request with an error of the form there's no directory named Documentation in which to put a file named Makefile, Git just goes and creates the directory.13 In effect, Git just makes the new folders whenever necessary. The OS demands they exist, so Git satisfies the OS.

This means that in order to clean up after itself, Git should remove directories / folders. So Git does that: if Git is removing the last of its own files, like Documentation/.gitattributes, from a folder that it probably made on its own earlier, like Documentation, Git will try to remove the now-empty folder. But there's a complication here: what if you made some untracked file in there, while Git wasn't looking?

Git's answer to this is simple: just don't complain if the removal fails. The OS will reject the attempt to remove the Documentation folder if it's not empty. If the OS rejects the attempt because it's not empty, well, shrug ... and move on.
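A small sketch of this behavior, using git checkout to move to a commit that lacks the files. The directory Other and the file notes.txt are made up; Documentation is borrowed from the example above.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo top > top.txt
git add top.txt && git commit -q -m "first"

mkdir Documentation Other
echo guide > Documentation/guide.txt
echo x > Other/x.txt
git add . && git commit -q -m "second"

echo mine > Documentation/notes.txt   # untracked file made "while Git wasn't looking"
git checkout -q HEAD~1                # go back to a commit without either folder

doc_dir=$([ -d Documentation ] && echo kept || echo removed)
other_dir=$([ -d Other ] && echo kept || echo removed)
echo "Documentation: $doc_dir, Other: $other_dir"
```

Git removed its own guide.txt, quietly failed to remove the non-empty Documentation folder, and removed the now-empty Other folder.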

When using git clean, the -d option controls whether Git should even look into untracked directories. At least, that's how it's documented. The thing is that there is no such thing as an untracked directory. Instead, there are only untracked files. When using git status, Git finds all the untracked files and complains about them. It's just that git status summarizes many files that are all within some sub-directory by saying that the directory is untracked. To get more information from git status, use the -uall option to prevent this kind of summarization. Since the cleaning that git stash does is hidden, we shouldn't have to worry about whether you're using the script version of git stash or the C-coded version, and any -d options. But do make note of the -uall option for git status.
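The difference between the summarized and the full listing looks like this in a scratch repository (the folder name copied is invented; -unormal is spelled out only so that a personal status.showUntrackedFiles setting cannot change the result):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

echo base > tracked.txt
git add tracked.txt && git commit -q -m "base"

mkdir copied
echo a > copied/a.txt
echo b > copied/b.txt

summary=$(git status --porcelain -unormal)   # one line for the whole directory
detail=$(git status --porcelain -uall)       # one line per untracked file
echo "summary:"; echo "$summary"
echo "detail:";  echo "$detail"
```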

Ultimately, the two interesting questions here are these:

  • Did Git put the files from that "untracked directory" into the u commit?

    To find out, list the contents of the u commit: git ls-tree -r stash^3. The stash^3 notation is gitrevisions syntax that will find the u commit in the stash. The git ls-tree -r will show all the file names in that commit.

  • Did Git fail to remove some of these files? If so, is that because of a permissions issue that Git silently ignored? Or was Git unable to save these files because the subdirectory itself contains a Git repository? Git cannot save a Git repository inside a Git repository: that's forbidden for security reasons. So an "untracked directory" that's the top level of a Git-repo-and-work-tree pair will lead to this kind of thing.

Only you have the file system (and Git repository) in question, so only you can answer these.
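As a starting point for that investigation, here is a sketch that collates the u commit against whatever untracked files are still lying around, and looks for nested repositories. The setup part just builds a scratch repository where everything works, so every list comes out empty; in your repository you would run only the diagnostic part. The scratch file names under .git are my own choice.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
echo base > tracked.txt
git add tracked.txt && git commit -q -m "base"
mkdir copied && echo a > copied/a.txt
git stash -q -u

# --- diagnostic part ---
git ls-tree -r --name-only stash^3 | sort > .git/in-u.txt               # saved in u
git ls-files --others --exclude-standard | sort > .git/lingering.txt    # still untracked

# Saved in the u commit, yet still present: the removal step failed.
saved_but_lingering=$(comm -12 .git/in-u.txt .git/lingering.txt)
# Still present and never saved: the files never made it into u.
never_saved=$(comm -13 .git/in-u.txt .git/lingering.txt)
# Any nested repository below the top level?
nested=$(find . -mindepth 2 -name .git -not -path './.git/*')

echo "saved but lingering: ${saved_but_lingering:-none}"
echo "never saved:         ${never_saved:-none}"
echo "nested repositories: ${nested:-none}"
```

Which of the three lists is non-empty tells you which of the questions above applies.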


12Technically, in the committed form, the names are stored with folder-like structures. It's the process of extracting the commit to the index that strings the names together with slashes. If an index could hold a directory, a lot of this stuff might change. Unfortunately there are technical issues with storing a directory name in an index (because Git has done something else with that—the index's internal format is complicated).

13This glosses over a lot of internal issues. In particular when we're moving from one commit to another, we could have a file named Documentation where we need a directory named Documentation/ to hold files, or vice versa. Git looks out for these in advance because it becomes important to issue the right operations to the OS at the right time. A directory/file conflict, or "d/f" conflict, makes Git shuffle the order of operations, between removing files that need to go away—because they were tracked files that aren't in the commit we're moving to—and creating files, because they will be tracked files in the commit we're moving to.

torek
  • Thanks for the comprehensive answer. Being a beginner to git, I learned some things that I didn't know before. There's no git repository inside the folder. I don't see any reason why it would be different from the others. Edit: I ran `git ls-tree stash^3` and it looks like the files in that folder are in the stash commit (I had run it in the wrong directory earlier). This means it's doing its job, right? So why didn't the `git reset --hard` part work? – Skawang Feb 09 '21 at 13:30
  • Note that `git reset --hard` itself doesn't touch any *untracked* files. (Not being in the [main] index means there is nothing to do to / with them.) The mystery is why, after putting those files into the extra `u` commit, the step that's then supposed to *remove* those same files from your work-tree, failed to remove at least one such file. (This assumes you've carefully collated the `git ls-tree -r stash^3` output against the lingering untracked files, and found that there are lingering untracked files that are in that commit. Otherwise, the mystery is why they're not in the `u` commit.) – torek Feb 09 '21 at 15:53