61

I'd like to find the commits in my code base that add video files so I can throw them out. Is there a way to look for these files in git?

For example, let's say all videos have a filename ending with the extension .wmv; I'd like to find all commits introducing these files and get rid of them with a fixup or something.

Any ideas?

Bastes

7 Answers

64

You can use git log with a pathspec:

git log --all -- '*.wmv'

This will get you all commits which make changes to .wmv files. Yes, this descends into subdirectories too (but you have to surround the pathspec with single quotes so that the shell passes it as-is to git).

If you are only interested in commit hashes (scripting etc.) use the git rev-list machinery directly:

git rev-list --all -- '*.wmv'

Under Windows, it might be required to use double quotes instead of single quotes around the pathspec, e.g. "*.wmv".
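
Since the question asks specifically for commits that introduce the files, the same pathspec can be combined with `--diff-filter=A`, which limits `git log` to commits where a matching file was added. The following is only a sketch in a throwaway repository; the directory, file names, and commit messages are made up for illustration:

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > main.c
git add main.c
git commit -q -m "add source"

mkdir media
echo "fake video" > media/clip.wmv
git add media/clip.wmv
git commit -q -m "add video"

echo "changed" > media/clip.wmv
git commit -q -am "modify video"

# Every commit that touches a .wmv file (added, modified, or deleted).
touching=$(git rev-list --all -- '*.wmv')

# Only the commits that add a .wmv file.
adding=$(git log --all --diff-filter=A --format=%H -- '*.wmv')

echo "commits touching .wmv files:"
echo "$touching"
echo "commits adding .wmv files:"
git log --all --diff-filter=A --format='%h %s' --name-only -- '*.wmv'
```

In this sketch the first list holds both the "add video" and "modify video" commits, while the second holds only "add video", which is the commit one would want to fix up.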

knittl
  • 5
    That didn't seem to work at all. I've got nothing after these two commands and a long wait. – Bastes Jul 01 '11 at 12:24
  • @bastes: then there are no .wmv files in your trees … @adl's answer is better though – knittl Jul 01 '11 at 14:02
  • 2
    No, this doesn't work. I tried git log --all -- '*.java' and got nothing, while find . -name '*.java' lists all the files. – misiu_mp Mar 12 '12 at 14:41
  • @misiu_mp: It does work (I just tried it in my git.git clone: `git log -- '*.sh'`). It finds files in subdirectories as expected (namely, all scripts in `/t/*`), when the wildcard is passed verbatim to the git commands. `find -name '*.java'` will list all files currently in your working copy, it will not walk the repository's history. `git log` will only work for tracked files. – knittl Mar 12 '12 at 14:48
  • 1
    Damn, I must have had a broken version because recompile of the newest 1.7.9.3 works just as you describe. I swear to god though, it didn't work. – misiu_mp Mar 12 '12 at 14:59
  • For the record, I had 1.7.4.4. – misiu_mp Mar 12 '12 at 15:07
  • That explains why it's not working for me, thanks >. – MSpreij Mar 29 '16 at 13:34
  • Note that your wildcard(s) should cover the (leading...) file-path part of the description if you are not looking in a specific directory, e.g. "*/backup-*.sh" when looking for all backup-related scripts. – MikeW Mar 16 '17 at 11:08
  • You can pass multiple patterns after the two dashes: `-- '*.wmv' '*mp4'` – Ben Nov 08 '19 at 16:13
  • I cannot get this to work! – gyozo kudor Oct 18 '21 at 08:53
  • @gyozokudor you need to provide a lot more information than that to get any useful replies. Which command did you use? What are your file paths? What is your Git version? Which shell do you use? Which OS do you use? What is "not working"? No output? Wrong output? An error message – which one? – knittl Oct 18 '21 at 09:28
7

If you want to remove these files from all your commits, consider rewriting the entire history with the filter-branch command. E.g.,

git filter-branch --index-filter 'git rm --cached --ignore-unmatch -r "*.wmv"' HEAD
adl
1

If the goal is to remove the files from the repository (thus rewriting history), use the BFG Repo-Cleaner, e.g.:

bfg --delete-files '*.wmv' --private --no-blob-protection

If the files are relevant, you can keep them under version control using Git LFS. To migrate (which also rewrites history), you can do something like:

git-lfs-migrate \
    -s original.git  \
    -d converted.git \
    -l https://user:passwd@custom-server.org:8080 \
    '*.wmv'

To simply list or examine the commits, I refer to knittl's answer:

git rev-list --all -- '*.wmv'
git log --all -- '*.wmv'
filipos
1

You can try this:

git log --follow *.wmv

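This will list all commits (with hash) that modified .wmv files. Note that the shell expands the unquoted *.wmv against the current directory before git sees it, and --follow accepts only a single path, so the command fails if more than one .wmv file matches there.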

0

This can work in gitk as well, using the View / New View / Enter files and directories to include, one per line box.

But note that you need a wildcard that covers the path section of the filename, or else nothing will show.

E.g., if you have had a file called backup-script.sh with a varied life (!), appearing in different places in the file tree, and you want to see all versions, then you must specify:

*/backup-script.sh
MikeW
0

To just view the commit hashes and the relevant file names for each commit you can use:

git rev-list --all -- '*.wmv' | while read -r x; do git diff-tree --name-only -r "$x"; done | grep -E '((\.wmv$)|(^[^\.]+$))'

This will print out each commit hash followed by any filenames matching the search string.
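
A similar listing can probably be had from `git log` alone, because `--name-only` together with a pathspec prints only the matching paths under each commit. This is a sketch rather than a drop-in replacement for the pipeline above; the throwaway repository and file names are assumptions for illustration:

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir media
echo "code" > main.c
echo "fake video" > media/clip.wmv
git add .
git commit -q -m "add source and video"

# One block per commit: the full hash, then the paths matching the pathspec.
listing=$(git log --all --format=%H --name-only -- '*.wmv')
echo "$listing"
```

Here main.c is committed together with the video but does not appear in the listing, since the pathspec also limits which paths are shown.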

wyattis
0

Yup, as mentioned, I think the thinko is that removing the commits that introduce the files is not going to remove the blobs: the objects stay in the repository until nothing references them and they are pruned.

See http://progit.org/book/ch9-7.html#removing_objects for an extensive treatment of the subject and examples
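
To make the point concrete, here is a rough sketch of the whole cycle in a throwaway repository: rewrite history with `git filter-branch`, observe that the blob is still there, then drop the backup refs and reflog and prune. The repository, file names, and the `blob` variable are made up for illustration, and `FILTER_BRANCH_SQUELCH_WARNING` is only set to silence the warning newer git versions print:

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > main.c
git add main.c
git commit -q -m "add source"

echo "fake video" > clip.wmv
git add clip.wmv
git commit -q -m "add video"

# Remember the object id of the video so we can check for it later.
blob=$(git rev-parse HEAD:clip.wmv)

# Rewrite the current branch so that no commit contains .wmv files.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch -f --index-filter \
    'git rm --cached --ignore-unmatch -r -q -- "*.wmv"' HEAD >/dev/null

# The file is gone from the rewritten history, but the blob still exists,
# kept alive by the backup refs under refs/original/ and by the reflog.
if git cat-file -e "$blob"; then
    echo "blob still present after rewrite"
fi

# Drop the backup refs, expire the reflog, and prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    echo "blob still present after gc"
else
    echo "blob removed"
fi
```

Other clones and remotes keep their own copies of the old objects, so this only shrinks the local repository.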

sehe
  • That link is now broken: Try https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery#Removing-Objects – Eosis Apr 01 '16 at 08:12