
I am trying to add a keyword to the message of every commit affecting a given part of my repository. I tried something like this:

git filter-branch --msg-filter \
'value=$(git ls-files -s | grep -v "folderX" | grep -q .) \
([[ $value -eq 1 ]] && echo "[flag]" && cat) || cat' HEAD

but it doesn't work. It seems that git ls-files -s returns nothing. I am not sure I understand what state the repository is in during the filter-branch operation, or what I should expect from git commands run inside the filters.

Does anybody know a way to achieve this?

Romain

2 Answers


The first problem here is that the --msg-filter is run with no index in place:

if test -n "$filter_index" ||
   test -n "$filter_tree" ||
   test -n "$filter_subdir"
then
        need_index=t
else
        need_index=
fi

You have only a $filter_msg (in fact, git filter-branch always has one; it is just cat by default), so need_index is left set to the empty string. Later, as filter-branch iterates through every commit to be copied, it therefore skips the step that reads the commit's tree into the index (this, like the above, is snipped from git-filter-branch.sh):

while read commit parents; do
    [snip]
            if test -n "$need_index"
            then
                    GIT_ALLOW_NULL_SHA1=1 git read-tree -i -m $commit
            fi
    [snip]

Since the tree for the commit has not been read into the index, git ls-files (which reads the index) finds nothing.

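The simplest method is to use git ls-tree -r $GIT_COMMIT instead of git ls-files; filter-branch sets $GIT_COMMIT to the ID of the commit being rewritten, and ls-tree reads the commit's tree directly rather than the index.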

This gets us to the second problem:

value=$(git ls-files -s | grep -v "folderX" | grep -q .)

I am not sure what you are trying to test here, but grep -q never produces any output, only an exit status. Thus, value will always be set to the empty string.
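To make the difference between output and exit status concrete, here is a minimal sketch. The sample path is made up for illustration; the point is only that command substitution captures what a command prints, while if tests what it returns.

```shell
# grep -q prints nothing, so the command substitution captures an empty string
# even though the pattern matches.
value=$(printf 'folderX/a.txt\n' | grep -q folderX)
echo "value=[$value]"

# Test the exit status of the pipeline directly instead of capturing output.
if printf 'folderX/a.txt\n' | grep -q folderX; then
    found=yes
else
    found=no
fi
echo "found=$found"
```

The same if-on-a-pipeline shape works inside a filter, with the printf replaced by whichever git command lists the paths.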

Also, there seems to be no reason to inspect the stage numbers and hashes (-s). The stage number would always be zero if we had an index, since, despite the -m option, we're not actually merging any trees here. You said:

... [flag] all the commits affecting a given part of my repository

and "affecting" usually means "modifying", which means "modifying with respect to something" (since each commit is a snapshot, it must be compared against some other commit in order to see what changed).

The choice for what to compare against is easy for most ordinary, single-parent commits: we just compare it against its (single) parent. It's less obvious for root commits, but we usually want to compare them against the empty tree (see "Is git's semi-secret empty tree object reliable, and why is there not a symbolic name for it?"). It's least obvious for merges: do we compare against first parent, or all parents? If all parents, do we consider a file modified if it's changed against any of them, or only if it is changed against all of them?

In any case, you will probably want to do something more complex here than just test whether some file(s) exist in the snapshot. Whatever that test turns out to be, you probably want your message filter to be simpler once the test completes:

--msg-filter 'some-test-here && printf "[flag] "; cat'
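One possible way to fill in the placeholder test is sketched below. It assumes that "affecting" means "changes at least one path under folderX/ relative to the parent", and it relies on git diff-tree, which reads commits directly and so needs no index. The helper name changes_under, the sample paths, and the sample message are all made up for illustration. The real filter-branch invocation appears only as a comment, because it rewrites history.

```shell
# changes_under DIR: hypothetical helper. Reads changed path names on stdin
# and succeeds if at least one of them lies inside DIR.
changes_under() {
    grep -q "^$1/"
}

# Stand-in for the output of:
#   git diff-tree --no-commit-id --name-only -r --root "$GIT_COMMIT"
changed_paths='README.md
folderX/main.c'

# Same shape as the message filter: optional prefix, then the message itself.
new_msg=$(
    if printf '%s\n' "$changed_paths" | changes_under folderX; then
        printf '[flag] '
    fi
    printf 'Fix overflow\n'   # stands in for cat, which copies the old message
)
echo "$new_msg"

# Inlined into filter-branch (rewrites history; try it on a clone first):
#   git filter-branch --msg-filter '
#       if git diff-tree --no-commit-id --name-only -r --root "$GIT_COMMIT" |
#           grep -q "^folderX/"
#       then printf "[flag] "; fi
#       cat' HEAD
```

The --root option is included so that a root commit is compared against the empty tree instead of producing no output; how merges should be treated is still the judgment call described above.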
torek
  • thank you for your answer. I ended up using `git diff-tree --no-commit-id --name-only -r $GIT_COMMIT` which also works without index – Romain Feb 12 '17 at 13:51
  • Note that `git diff-tree` compares the commit against all its parents, and for merges, considers a file modified if it differs from every parent. For a standard two-parent merge that means file F is changed if either: (a) both parents contributed a real change, or (b) the committer made a so-called "evil merge". This is usually a pretty good criterion for paying attention to the merge, hence is probably right. :-) – torek Feb 12 '17 at 15:24

Try debugging the filter. Use something like this:

echo $value >&2

The >&2 redirects the output to the standard error stream, so it appears on the console instead of ending up in the commit message.
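A small sketch of why the redirection matters: in a --msg-filter, whatever the filter writes to standard output becomes the new commit message, so diagnostics have to go to standard error. The debug helper and the sample message here are hypothetical, and the command substitution merely stands in for filter-branch capturing the filter's output.

```shell
# debug: hypothetical helper that writes its arguments to stderr only.
debug() {
    printf 'debug: %s\n' "$*" >&2
}

# The command substitution captures stdout only, much as filter-branch
# captures the filter's stdout as the rewritten message.
new_msg=$(
    value=""
    debug "value=[$value]"          # visible on the console, not captured
    printf 'Original message\n'     # stands in for cat
)
echo "$new_msg"
```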

oklas