When I run git log -S[string] or git log -G[string], Git searches for [string] through the whole history. Along the way it encounters some old .docx files that I committed once and have since deleted. Git is not happy with them, though, since it prints either:

<[filename].docx> does not seem to be a docx file!

or

Failed to extract required information from <[filename].docx>!

In my case, I never expect the results of a -S/-G search to turn up in those files. Can I just ask Git to skip them when searching, which would save time (the search is noticeably slower) and spare me the error messages?

Charles
  • You mean `--no-textconv`? – matt Mar 24 '22 at 13:02
  • I do, quite exactly! Thank you! Feel free to post this as an answer and I will mark it as accepted! – Charles Mar 24 '22 at 14:40
  • Though actually, the title of my post was "ignore files", without specifying which kind, and that would also be an interesting question, in case you happen to know the answer off the top of your head. – Charles Mar 24 '22 at 14:41

1 Answer

What's happening here is that you have told Git that, while *.docx files are binary and hence not generally readable,[1] Git can convert such files to plain text, for diff and/or grep purposes, by running some filter program. Unfortunately this particular filter program can't read these files, perhaps due to format changes.[2] (It is this program, not Git, that is emitting the error messages.)
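
To see which filter program is involved, you can ask Git which diff driver applies to a path and what that driver runs. The following is a minimal sketch in a throwaway repository. The driver name docx, the converter name docx2txt, and the file name old-report.docx are assumptions made for illustration; your own .gitattributes and config may use different names.

```shell
# Throwaway repository so the sketch does not touch a real one.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# ASSUMED setup, resembling the one described above:
# *.docx is routed to a diff driver named "docx" (hypothetical name),
# whose textconv program is "docx2txt" (assumed, not confirmed by the question).
echo '*.docx diff=docx' > .gitattributes
git config diff.docx.textconv docx2txt

# Which diff driver applies to this path? (The file need not exist.)
driver=$(git check-attr diff -- old-report.docx)
echo "$driver"

# Which program does that driver run as its textconv filter?
converter=$(git config --get diff.docx.textconv)
echo "$converter"
```

In a real repository only the two query commands are needed. The attribute may also be set in .git/info/attributes or a global attributes file rather than in .gitattributes.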

> ... files that I had committed once but since then deleted

You haven't deleted them (you literally can't delete them): they're still there in the old commits. Your git log is looking through all the commits, including the old ones.

> Can I just ask git to skip them when searching

You can: git log takes an optional pathspec at the end, and the pathspec can be a negative pathspec, i.e., a pattern for paths that Git should exclude. In this case :!*.docx would most likely work as this negative pathspec. Note that ! and * are special to various shells, so the pathspec usually needs quoting (single quotes work in POSIX-style shells).
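
Here is a minimal sketch of the negative pathspec in a throwaway repository. The file names and the search string needle are hypothetical. The leading . pathspec is added as a precaution, on the assumption that older Git versions want at least one positive pathspec next to an exclusion.

```shell
# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# One commit that adds the search string to a text file and to a fake .docx file.
echo "needle" > notes.txt
echo "needle" > report.docx
git add notes.txt report.docx
git commit -q -m "add notes and report"

# Search for "needle", excluding every path that matches *.docx.
# Single quotes stop the shell from interpreting ! and *.
matches=$(git log -S'needle' --name-only --format= -- . ':!*.docx')
echo "$matches"
```

The long form ':(exclude)*.docx' is equivalent to ':!*.docx' and avoids the ! character, which is convenient in shells with history expansion.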

Apart from this, as matt noted in a comment, you can add --no-textconv to your git log to ignore textconv directives in .gitattributes files. You might also reconsider whether you want to use these textconv attributes at all, if they don't work on whatever docx files you have.
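
The effect of --no-textconv can be sketched as follows, again in a throwaway repository. The script conv.sh is a hypothetical stand-in for the real converter: it only announces itself on stderr and passes the file through, so you can see when Git invokes it.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Stand-in converter: reports on stderr that it ran, then emits the file as-is.
cat > "$repo/conv.sh" <<'EOF'
#!/bin/sh
echo "textconv ran on $1" >&2
cat "$1"
EOF
chmod +x "$repo/conv.sh"

# Route *.docx through the stand-in, mimicking a textconv setup.
echo '*.docx diff=docx' > .gitattributes
git config diff.docx.textconv "$repo/conv.sh"

echo "needle" > report.docx
git add .gitattributes report.docx
git commit -q -m "add report"

# Capture only stderr: by default the converter is invoked during the search.
with_conv=$(git log -S'needle' --oneline 2>&1 >/dev/null)
# With --no-textconv the converter is skipped and the raw content is searched.
without_conv=$(git log -S'needle' --no-textconv --oneline 2>&1 >/dev/null)

echo "default:       $with_conv"
echo "--no-textconv: $without_conv"
```

Unlike the negative pathspec, this still searches the raw bytes of the .docx files, so it removes the converter's error messages but not necessarily all of the extra search time.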


[1] Some docx files are binary due to being compressed, and some are just difficult to read.

[2] There's no single standard docx format. See Wikipedia for details.

torek