
I have found a case where git does not ask you to stash but instead silently clobbers files that you believed were safe because they are listed in .gitignore. This happens even if you have had the same ignore file since your initial commit.

The problem occurs when you are on a commit in which the file has been removed from tracking but is listed in .gitignore, and you then check out another commit in which the file was still tracked:

git init
mkdir ignored/
echo stuff > ignored/file
echo otherstuff > otherfile
git add -A
# Oops, added my ignored folder's files
# Forgetting to rm --cached
echo ignored/ > .gitignore ; git add .gitignore
git commit -m 'accidentally add ignored/file'
git status
touch dummyfile; git add dummyfile
# Remembered to rm --cached
git rm --cached -rf ignored/file # This file is now vulnerable to clobbering
git commit -m 'add stuff'
echo somechange >> ignored/file

## Wait for it..
git checkout HEAD~
## somechange has been silently clobbered!!

# Please paste the first paragraph, observe, and then paste the second.
# Note that both commits have the correct ignore file and are still not immune!

(cd to an empty folder before pasting above code into terminal)

Is there any way to prevent this silent clobber?

sabgenton

1 Answer


It's not a bug (or at least, the git developers don't consider it to be one).

The contents of .gitignore are not "files that should be ignored" or "path names that should not be touched"; instead, they're "paths that don't get automatically added, and are suppressed from being shown as untracked" (which makes .gitignore a poor name).

Once you've committed a file, or even added it to the index, it's no longer ignored, regardless of whether it is listed in a .gitignore.
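A minimal sketch of that point, assuming a throwaway repository (the temporary directory and the demo identity are placeholders, not anything from the question): the path matches a rule in .gitignore, yet because it has an index entry Git keeps reporting changes to it.

```shell
# Throwaway repository; the identity below is a placeholder for the demo.
cd "$(mktemp -d)" && git init -q .
git config user.email demo@example.com
git config user.name Demo

mkdir ignored
echo stuff > ignored/file
git add ignored/file                 # tracked before any ignore rule exists
echo 'ignored/' > .gitignore
git add .gitignore
git commit -q -m 'track ignored/file, then add the ignore rule'

echo change >> ignored/file
# The path matches .gitignore, but it is tracked, so it is not ignored.
status_out=$(git status --porcelain)
echo "$status_out"
```

The status line for ignored/file shows it as modified, which is what you would expect for any other tracked file.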

For some files, you can use git update-index --assume-unchanged or (better) git update-index --skip-worktree, but in general, if you've accidentally committed a file you should not have, and you want it ignored, you must "rewrite history" to take it entirely out of the repository to get good behavior. That's not too difficult if you haven't pushed anything and have few commits that contain the unwanted file, but much harder if you have pushed, or have many such commits.

See also Git - Difference Between 'assume-unchanged' and 'skip-worktree'.
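For the easy case described above (nothing pushed, and only the most recent commit contains the unwanted file), the rewrite can be as small as amending that commit. This is a sketch under those assumptions, run in a throwaway repository with a placeholder identity; it is not a recipe for history that has already been shared.

```shell
# Throwaway repository; the identity below is a placeholder for the demo.
cd "$(mktemp -d)" && git init -q .
git config user.email demo@example.com
git config user.name Demo

mkdir ignored
echo stuff > ignored/file
echo 'ignored/' > .gitignore
git add -f ignored/file .gitignore   # -f: add despite the ignore rule
git commit -q -m 'accidentally add ignored/file'

# Remove the path from the index only (the work-tree copy stays),
# then replace the unpushed commit with one that never had the file.
git rm -r -q --cached ignored
git commit -q --amend --no-edit

tracked_in_head=$(git ls-tree -r --name-only HEAD)
echo "$tracked_in_head"
cat ignored/file
```

After the amend, no commit on the branch contains ignored/file, so checking out commits on that branch has no tracked copy to write over the local one.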

[Text below added December 2016]

Technical details

For Git, there is a short, simple, and sweet definition of a tracked file: A file is tracked if and only if there is an entry for it in the index.
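One way to test that definition from the command line is `git ls-files --error-unmatch`, which fails when the path has no index entry. The helper name `is_tracked` and the file names below are made up for this sketch.

```shell
cd "$(mktemp -d)" && git init -q .

echo a > tracked.txt
echo b > untracked.txt
git add tracked.txt                  # creates the index entry; no commit needed

# Hypothetical helper: succeeds only if the path has an index entry.
is_tracked() {
    git ls-files --error-unmatch -- "$1" >/dev/null 2>&1
}

is_tracked tracked.txt   && echo "tracked.txt: tracked"
is_tracked untracked.txt || echo "untracked.txt: not tracked"

# Removing the index entry makes the file untracked again,
# even though the file itself is still in the work tree.
git rm -q --cached tracked.txt
is_tracked tracked.txt   || echo "tracked.txt: no longer tracked"
```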

The "assume unchanged" and "skip worktree" things are flags you can manually set or clear in the index. They are separate flags and can be set individually, although it's not clear what it means to set both. The intent of "assume unchanged" is merely to make Git faster, by allowing it to assume that the file is not changed and hence does not need to be updated by git add, while the intent of the "skip worktree" flag is to tell Git "hands off": don't just assume it's unchanged, try hard to keep it unchanged. That's harder than it may look at first blush; for more about this, see the above linked post.

In order to set these flags, the file must have an index entry, so it must be tracked.
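A short sketch of setting and inspecting the two flags, with made-up file names. `git ls-files -v` prints a status letter per entry: a lowercase letter marks "assume unchanged" and `S` marks "skip worktree". The last step tries to flag a file that has no index entry, which Git refuses.

```shell
cd "$(mktemp -d)" && git init -q .

echo a > fast.txt
echo b > local.txt
echo c > loose.txt
git add fast.txt local.txt           # loose.txt is left untracked on purpose

git update-index --assume-unchanged fast.txt
git update-index --skip-worktree local.txt

flags=$(git ls-files -v)             # one status letter per index entry
echo "$flags"

# No index entry, so there is nothing to hang a flag on.
git update-index --skip-worktree loose.txt 2>/dev/null
untracked_rc=$?
[ "$untracked_rc" -ne 0 ] && echo "loose.txt: cannot be flagged (not tracked)"

# Clearing the flags again.
git update-index --no-assume-unchanged fast.txt
git update-index --no-skip-worktree local.txt
```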

If a file is not tracked, it may or may not also be ignored. Ignore rules come from several sources: not just the top-level .gitignore, which contains one pattern per line in a variant of Git's pathspec syntax that is limited to glob matching and negation, but also the .gitignore files in any sub-directory of the repository, the file $GIT_DIR/info/exclude if it exists, and the file named by core.excludesFile if that configuration entry exists. To find out whether and why a file is ignored, use git check-ignore, available since Git version 1.8.2. The -v option tells you which control file marked the file as ignored.
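As an illustration, reusing the file names from the question in a throwaway repository, `git check-ignore -v` prints the source file, the line number, and the pattern that matched, followed by the path:

```shell
cd "$(mktemp -d)" && git init -q .

mkdir ignored
echo stuff > ignored/file
echo other > otherfile
echo 'ignored/' > .gitignore

# Prints <source>:<line>:<pattern><TAB><path> for an ignored path.
why=$(git check-ignore -v ignored/file)
echo "$why"

# Exits non-zero when the path matches no ignore rule.
git check-ignore -q otherfile || echo "otherfile: not ignored"
```

Note that by default check-ignore only reports untracked paths; a tracked file is not ignored no matter what the rules say.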

Unfortunately, as in the question to which this is an answer, "ignored" has two meanings. While matching an ignore path keeps Git quiet about the file being untracked, it also makes Git feel free to clobber the file.
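Git has no built-in switch for this, but you can check by hand before switching commits. The following is only a sketch: `at_risk` is a hypothetical helper, not a Git command, and it assumes path names without unusual characters (Git quotes those in its output). It lists untracked, ignored files that exist as tracked files in the target commit, which are the ones a checkout may overwrite without asking.

```shell
# Recreate the situation from the question in a throwaway repository.
cd "$(mktemp -d)" && git init -q .
git config user.email demo@example.com   # placeholder identity
git config user.name Demo

mkdir ignored
echo stuff > ignored/file
echo 'ignored/' > .gitignore
git add -f ignored/file .gitignore
git commit -q -m 'accidentally add ignored/file'
git rm -r -q --cached ignored
git commit -q -m 'stop tracking ignored/file'
echo somechange >> ignored/file

# Hypothetical helper: print ignored, untracked files that the given
# commit contains as tracked files.
at_risk() {
    git ls-files --others --ignored --exclude-standard |
    while IFS= read -r path; do
        git cat-file -e "$1:$path" 2>/dev/null && printf '%s\n' "$path"
    done
}

risky=$(at_risk HEAD~)
if [ -n "$risky" ]; then
    echo "checking out HEAD~ may overwrite:"
    echo "$risky"
fi
```

If the list is non-empty, copy those files somewhere safe before running git checkout.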

torek
  • Dooh, as I thought. So basically it's `.gitignore`, scary as usually git floats work across when you checkout commits. A lot of people depend on this (and shouldn't). – sabgenton Jul 10 '15 at 23:13
  • Yes, there really should be some way to at least get git to warn about this. It's not quite clear what that way should be though. Any *other* file that is recorded as "exists in rev X, but not in rev Y" should be silently removed on going from rev X to rev Y. Perhaps a new directives file, `.gitprecious` for instance, could be created, and files that should not be removed could be listed here. This complicates `git checkout`'s job, it now must extract the target commit's "precious list" *first*, then decide what files can be clobbered, but it would provide a sort of life jacket. – torek Jul 11 '15 at 00:42
  • wouldn't `--skip-worktree` be the thing to use? I thought --assume-unchanged was supposed to be for when the file is not currently changed? – sabgenton Jul 17 '15 at 08:20
  • @sabgenton: I haven't tried it, so I can't really say which one to use. There is a lot of git mailing list discussion (from around 2010 as I recall) on how ignored files perhaps should be divided into two categories: files that git should not check in *and* feel free to clobber, vs files that git should not check in *but* should *not* clobber ("precious" files). It's never been properly implemented, though. – torek Jul 17 '15 at 18:03
  • "In particular, git rm or git rm --cached writes a special "to be deleted" entry into the index," . . . I didn't think it worked this way and don't see it in the code, can you explain what this is referring to? – jthill Dec 10 '16 at 23:42
  • @jthill: Hm, it *used* to, I remember seeing the special "removed" entries. Let me test something... ... you're right, this is no longer the case (if it ever was, maybe I'm remembering a different VCS :-) ). – torek Dec 11 '16 at 00:50
  • I see the `CE_REMOVE` bit, but `git grep CE_REMOVE` shows a comment in `cache-tree.c` saying entries with that bit are simply not written; I think it's a way of avoiding O( N^2 ) removing multiple entries one at a time. – jthill Dec 11 '16 at 00:53
  • @jthill: Updated now, I took out the extra footnote. It might be the case that `CE_REMOVE` entries were written back to the index file at one time. I can't reproduce it now of course, but besides the bit itself, I remember some sort of observable behavior that implied the index entry was still there. – torek Dec 11 '16 at 00:57