I recently did a git rm of the last file in a directory(a) and, for some reason, it decided to delete the directory as well. I tested the behaviour with a new file as follows:

mkdir newdir
touch newdir/newfile
git add newdir/newfile
git commit
git rm newdir/newfile

When I do that last line, the newdir directory disappears altogether. Is that expected behaviour? My understanding was that Git tracked files, not directories.

Since it doesn't complain after step one above, the creation of a directory with no tracked files, why is it deleting a directory just because I remove the last tracked file from it?

For what it's worth, I'm running version 2.7.4.


(a) I had a single placeholder directory with a gitDummy file so that it was pushed to the repo. I then had a whole lot of real files that I wanted to add so I removed the dummy then tried to copy in the new files in preparation for adding, committing and pushing.

Lo and behold, the copy operation failed because the directory had gone. I suspect it would have worked if I'd copied in the files before deleting the dummy but it still strikes me as strange that Git removes directories when it shouldn't care about them.

paxdiablo
  • Reproducible on v2.17.1 as well – zerkms Apr 03 '19 at 04:02
  • I could reproduce the same in 2.20.1. However, if I add test/.gitkeep the directory stays and git doesn't remove it. Maybe it is strange that git removes the local directory, but since it would be empty anyway and git will ignore it anyway, I feel like it's doing me a favour by removing it altogether so that I don't get confused by it later. The right way to maintain an empty folder in Git is with a .gitkeep file – Mihai Apr 03 '19 at 04:19
  • Since `git rm` stages changes, I think it makes sense for it to put your working tree in sync with the state of the repo according to those changes. (If someone pulled that, it would delete their copy of the directory, after all.) – Ry- Apr 03 '19 at 04:26
  • TLDR; of my long answer: yes, Git is supposed to remove an empty folder when its last tracked content is deleted. – VonC Apr 03 '19 at 05:05
  • As a side note, I have seen rare occasions when Git *doesn't* delete a directory that ultimately has no files inside it (e.g., after `git checkout`). I have never been able to reproduce this, so I'm not sure what causes it. – torek Apr 03 '19 at 05:17
  • Note: see https://stackoverflow.com/a/70612805/6309: this should be better with Git 2.35 (Q1 2022) – VonC Jan 06 '22 at 19:49

1 Answer

2022:

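This should be less of an issue with Git 2.35 (Q1 2022): commands that would remove a directory because it became empty now keep it when it is the directory they were started in.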

No more "fatal: Unable to read current working directory: No such file or directory"


2019: original answer:

This is discussed in this June 2018 thread, where it is reported as a "git rm bug".

TLDR; it is not a bug.

No Git command should behave in such a way as to leave the tree in a state when moving from commit X to Y that you wouldn't get the same Y if you re-cloned.

On to the thread:

OVERVIEW

"git rm" will remove more files than specified. This is either a bug or undocumented behavior (not in the man pages).

SETUP

  1. In a git repository, create an empty directory OR a chain of empty directories

    $ mkdir -p path/to/some/

  2. Create a file in the deepest directory and add it to tracking

    $ touch path/to/some/file
    $ git add path/to/some/file
    $ git commit -m 'add path/to/some/file'

THE BUG

Run 'git rm' on the tracked file.

EXPECTED BEHAVIOR

$ git rm path/to/some/file
rm 'path/to/some/file'
$ ls path
to/
$ ls path/to
some/

Note that path/, path/to/, and path/to/some/ still exist.

ACTUAL BEHAVIOR

$ git rm path/to/some/file
rm 'path/to/some/file'
$ ls path
ls: cannot access 'path': No such file or directory

The entire chain of empty directories is removed, despite the fact that git outputs only "rm 'path/to/some/file'".

This ONLY occurs when all the directories in the chain are empty after the tracked file has been removed.

This behavior is NOT documented in the man pages.

I propose that 'rmdir' statements are added to 'git rm' output, or that the man pages be updated to reflect this behavior.

The general principle is:

Git cannot track empty directories.
As that was the only content in that whole hierarchy, the entire hierarchy had to be deleted.
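To make that principle concrete, here is a minimal sketch run in a throwaway repository. The directory names, the identity values, and the `.gitkeep` file name are all assumptions for illustration; `.gitkeep` is only a naming convention mentioned in the comments, not a Git feature.

```shell
# Throwaway repository; the path comes from mktemp, nothing existing is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

# An empty directory gives Git nothing to record, so the index stays empty.
mkdir emptydir
git add emptydir
tracked_before=$(git ls-files)

# With a tracked placeholder file in the directory, removing the other
# tracked file leaves the directory in place, because it is not empty.
mkdir newdir
touch newdir/.gitkeep newdir/newfile
git add newdir
git commit -q -m "add newdir with a placeholder"
git rm -q newdir/newfile
ls -A newdir
```

The directory survives here only because something tracked still lives in it; remove the placeholder as well and the directory goes with it.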

It looks like this behavior has been in place for many years, since d9b814cc97 ("Add builtin "git rm" command", 2006-05-19, Git v1.4.0-rc1).
Interestingly, Linus noted in the commit message that the removal of leading directories was different than when git-rm was a shell script.
And he wondered if it might be worth having an option to control that behavior.

I imagine that most users either want the current behavior or they rarely run across this and are surprised, given how long git rm has worked this way.

It's also consistent with other parts of Git that remove files. E.g., "git checkout" to a state that does not have the file will remove the leading directories (if they're empty, of course).
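The checkout case can be reproduced with a short sketch like the one below. It assumes a throwaway repository created with `mktemp`, and the file names and identity values are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

# First commit: no nested directory yet.
echo base > base.txt
git add base.txt
git commit -q -m "base"
first=$(git rev-parse HEAD)

# Second commit: a file at the end of a chain of directories.
mkdir -p path/to/some
echo content > path/to/some/file
git add path/to/some/file
git commit -q -m "add path/to/some/file"

# Going back to the first commit removes the file and, because nothing
# else lives in them, the leading directories too.
git checkout -q "$first"
if [ -d path ]; then
    checkout_result="path still exists"
else
    checkout_result="path removed"
fi
echo "$checkout_result"
```

The sketch stays at the repository root on purpose: from Git 2.35 on, a directory is kept if it is the one the command was started in.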

More generally:

I am going to be contrarian and obstinate and suggest that the current behaviour is fine, since there is no compelling rationale for any other behaviour.

Invariably, every defense for hanging on to empty directories boils down to, "I might do something in the future that expects those directories to exist."
Well, if that's the case, then create them when you need them -- nothing you do should ever simply assume the existence of essential directories.

In addition, by "untracking" those directories, you're suggesting that Git quietly do what should normally be done by "git rm --cached".
If I want that behaviour, I would prefer to have to type it myself.
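For comparison, a small sketch of what `git rm --cached` does in the question's scenario. The repository location and identity values are assumptions for the example; `newdir/newfile` is taken from the question.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

mkdir newdir
echo content > newdir/newfile
git add newdir/newfile
git commit -q -m "add newdir/newfile"

# --cached removes the path from the index only; the working tree copy,
# and therefore the directory holding it, is left alone.
git rm -q --cached newdir/newfile
tracked_after=$(git ls-files)
git status --short
```

After this the file is staged for deletion but still on disk as an untracked file, so the directory is not empty and nothing is removed from the working tree.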

For example, to illustrate why not deleting empty folders would be problematic:

Others have said why, but here's an edge case you probably haven't thought of:

(
   rm -rf /tmp/repo &&
   git init /tmp/repo &&
   cd /tmp/repo &&
   mkdir -p foo/bar/baz &&
   git status
)

If you just have empty directories "git status" will report nothing, although "git clean -dxfn" will show what would be cleaned up.

So if this worked as you're suggesting then someone could git rm some file, then everything would report that they're on commit XYZ, but if they re-cloned at that commit they'd get a tree that would look different.

No Git command should behave in such a way as to leave the tree in a state when moving from commit X->Y that you wouldn't get the same Y if you re-cloned.
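The re-clone argument can be checked with a sketch along these lines. It assumes two temporary directories from `mktemp`; the file name `tracked.txt` and the identity values are invented for the example, and `foo/bar/baz` is borrowed from the snippet above.

```shell
repo=$(mktemp -d)
clone=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

echo content > tracked.txt
git add tracked.txt
git commit -q -m "add tracked.txt"

# Empty directories in the working tree: not reported by status and not
# part of any commit.
mkdir -p foo/bar/baz
status_output=$(git status --porcelain)

# A clone of the same commit has only the tracked content.
git clone -q "$repo" "$clone"
ls "$clone"
```

Both working trees are at the same commit, yet only the original has `foo/`. That is the mismatch the quoted message says a Git command should not create on its own.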


Note: the official git rm test (git/git/t/t3600-rm.sh) of the Git repo itself is quite clear as to what it expects:

test_expect_success 'rm removes subdirectories recursively' '
    mkdir -p dir/subdir/subsubdir &&
    echo content >dir/subdir/subsubdir/file &&
    git add dir/subdir/subsubdir/file &&
    git rm -f dir/subdir/subsubdir/file &&
    ! test -d dir
'
VonC
  • This is a great answer. Thanks. – Sid Apr 03 '19 at 06:15
  • "No Git command should behave in such a way as to leave the tree in a state when moving from commit X to Y that you wouldn't get the same Y if you re-cloned." --- ".gitignore" easily breaks this promise. – zerkms Apr 03 '19 at 21:31
  • @zerkms kind of... but `.gitignore` is a directive, not a command. `git rm` is a command. And `.gitignore` is about *untracked* content. `git rm` removing folders is about *tracked* content coherency between commits. – VonC Apr 03 '19 at 21:53