
git filter-branch fails with the message "Cannot rewrite branch(es) with a dirty working directory" if a "git rm" command was run before it. In spite of that message, the git filter-branch command appears to have erased the specified folder. Therefore, I don't understand what the error message really means or what impact it has on what I am doing.

I am on Linux with git version 1.7.1. Please see below for exact commands.

git clone ...
cd /home/userid/fpcnav_test
git rm -q -r -f  --ignore-unmatch olddir1
git filter-branch --force --index-filter "git rm -q -r -f --ignore-unmatch olddir2" --prune-empty --tag-name-filter cat -- --all

It seems that "git rm" makes the working directory dirty, which causes the git filter-branch command to fail. I work around this by running git stash (or git commit -m ...) between the git rm and git filter-branch commands. My intention is to run a series of "git rm" and "git filter-branch" commands, but I can't do it cleanly unless I run git stash between each git rm and git filter-branch. Is there a clean way to do this? Eventually, I will be running these commands against the original repository and not a clone. Thanks.

vic99

1 Answer


First, a warning: do not run filter-branch --prune-empty ... --all with active stashes: it tends to break them. (It probably works OK if you keep empty commits, since the breakage is caused by deleting an apparently-empty index commit of the special stash pseudo-merges. This is based on an answer I provided quite a while ago where someone had a corrupted stash after using filter-branch.)

That out of the way: of course git rm (if successful) dirties the work directory, since you are now ready to make a new commit that actually has those removed files removed. It makes no sense to do that just before doing your filter-branch (which has the same git rm as its filter—you may want to add --cached) since the filtering git rm will apply to the current commit too.
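As a rough illustration of the point above, the following sketch builds a throwaway repository and shows that git rm leaves a staged deletion behind, and that committing it returns the tree to a clean state so that filter-branch can run. The file and directory names (keep.txt, olddir1, olddir2), the identity settings, and the commit messages are made up for the example; FILTER_BRANCH_SQUELCH_WARNING only affects newer Git versions and is ignored by older ones.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

mkdir olddir1 olddir2
echo a > olddir1/a.txt
echo b > olddir2/b.txt
echo k > keep.txt
git add .
git commit -q -m "initial"

# git rm stages the deletion, so the index now differs from HEAD.
git rm -q -r -f --ignore-unmatch olddir1
dirty_before=$(git status --porcelain | wc -l | tr -d ' ')

# Committing the removal makes the tree clean again.
git commit -q -m "Remove olddir1 from the tip only"
dirty_after=$(git status --porcelain | wc -l | tr -d ' ')
echo "status entries before commit: $dirty_before, after: $dirty_after"

# With a clean tree, filter-branch has nothing to object to.
# --cached limits the filter's git rm to the index.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --index-filter \
  "git rm -q -r --cached --ignore-unmatch olddir2" \
  --prune-empty --tag-name-filter cat -- --all >/dev/null

remaining=$(git ls-tree -r --name-only HEAD)
echo "files left at HEAD: $remaining"
```

Note that olddir1 still exists in the first rewritten commit (it was only removed by a later commit), while olddir2 is absent from every rewritten commit.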

Remember that filter-branch is something like rebase on steroids: it makes copies of every filtered commit. Once the copies are made, filter-branch adjusts the specified references (branch names, and tag names if given a tag filter) to point to the copy version of the original commit (or, for commits deleted via --prune-empty or a commit filter that skips commits, the most appropriate copied commit).
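The copying behavior can be observed directly. The sketch below, again using a scratch repository with made-up names (keep.txt, olddir2), records the commit ID before and after filtering and reads the backup reference that filter-branch leaves under refs/original/. It is meant as a small demonstration, not a recipe for a production repository.

```shell
#!/bin/sh
set -e

# Scratch repository; names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

mkdir olddir2
echo b > olddir2/b.txt
echo k > keep.txt
git add .
git commit -q -m "initial"

old_head=$(git rev-parse HEAD)
branch=$(git symbolic-ref HEAD)   # full ref name of the current branch

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --index-filter \
  "git rm -q -r --cached --ignore-unmatch olddir2" \
  --prune-empty --tag-name-filter cat -- --all >/dev/null

new_head=$(git rev-parse HEAD)
# filter-branch saves the pre-rewrite tip under refs/original/<full ref name>.
backup=$(git rev-parse "refs/original/$branch")

echo "original commit:  $old_head"
echo "rewritten commit: $new_head"
echo "backup ref:       $backup"
```

The branch now points at a new commit, and the original commit is still reachable through the backup reference until that reference is deleted and the objects are garbage-collected.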

torek
  • "This is based on an answer I provided quite a while ago where someone had a corrupted stash after using filter-branch": http://stackoverflow.com/a/22079133/6309 I suppose? – VonC May 21 '16 at 09:20
  • That's the one! Thanks, linked now. – torek May 21 '16 at 09:56
  • I run the git rm command separately from filter-branch's git rm because I would like to only delete the data and not its history. I have two separate lists of aged data that need to be pruned. One list (where data is older than 6 months) is for removal of the data and its history. The other list (where data is older than 3 months but newer than 6 months) is only for removal of the data and NOT its history. – vic99 May 22 '16 at 09:27
  • "I would like to only delete the data and not its history": there is no such thing. There is no "history of data". History is recorded in *commits*. `git rm` modifies the index (and optionally work-tree as well) to prepare for making another commit, and it is making commits that makes history, in Git. – torek May 22 '16 at 11:19
  • I would like to remove the data and all of its history if the data is older than 6 months, so I use the git filter-branch 'git rm ..' command. Where the data is older than 3 months but newer than 6 months, I use git rm for cleanup so I can recover the data at a later time, if I decide to do so. My understanding is that git rm removes the data and not the history, whereas the git filter-branch 'git rm ..' command permanently removes the data and its history. Please confirm whether my understanding is right. – vic99 May 23 '16 at 00:44
  • No, that's not right. "History" is commits: nothing more, nothing less. `git filter-branch` copies commits (making changes before making the copies), so you had one history, now you have two. Stop looking at the old history—pretend it's gone—and you now *think* you have a different history, but in fact you just have two histories. Eventually, if you remove all references to the old commits, Git will garbage-collect them, and you really will just have the one history. But in any case `git rm` just prepares for the *next* commit to be different. – torek May 23 '16 at 01:22