I want to know details about a file that was in a git repo at some point but is not in the repo now. There are three scenarios to look for here.

  1. File was renamed. If so, what is the current name of the file?
  2. File was deleted. This should be straightforward provided the file was not renamed at some point before it was deleted.
  3. File was renamed and then deleted.

In all these cases I want to know the file's current name if it was renamed, or whether it was deleted.

I am on a Windows machine using Git in PowerShell, but these steps should be easy to replicate on other systems.

STEPS

a. I added 2 files

Add-Content file1.txt "This is file 1"
Add-Content file2.txt "This is file 2"
git add .
git commit -m "Added file1.txt and file2.txt"

b. I renamed file1.txt

git mv file1.txt fileone.txt
git commit -m "Renamed file1.txt to fileone.txt"

c. I deleted file2.txt

Remove-Item file2.txt
git add .
git commit -m "Deleted file file2.txt"

There might be more commits between these commits that do not change these 2 files. This is my git log so far:

git log --oneline
d618114 (HEAD -> master) deleted file2.txt
ba6ec22 Renamed file1.txt to fileone.txt
fe2a51e Added file1.txt and file2.txt

This is what I have so far.

git log --name-status -- "file1.txt"

outputs:

commit ba6ec22e3fdf7e6eb6f33acd83f49f99e9f2610a
Author: Sunil Shahi <myemail@email.com>
Date:   Sun Oct 8 15:35:02 2017 -0500

    Renamed file1.txt to fileone.txt

D       file1.txt

commit fe2a51e9aa5835c5886b31f988e4076155c1194e
Author: Sunil Shahi <myemail@email.com>
Date:   Sun Oct 8 15:31:27 2017 -0500

    Added file1.txt and file2.txt

A       file1.txt

The problem with this is that it shows the file as deleted in my second commit when in fact it was renamed. If I use the current file name with the --follow flag, I get more detail, but in the real case I do not know the current file name.

git log --follow --name-status -- "fileone.txt"

outputs

commit ba6ec22e3fdf7e6eb6f33acd83f49f99e9f2610a
Author: Sunil Shahi <myemail@email.com>
Date:   Sun Oct 8 15:35:02 2017 -0500

    Renamed file1.txt to fileone.txt

R100    file1.txt       fileone.txt

commit fe2a51e9aa5835c5886b31f988e4076155c1194e
Author: Sunil Shahi <myemail@email.com>
Date:   Sun Oct 8 15:31:27 2017 -0500

    Added file1.txt and file2.txt

A       file1.txt

This approach is sufficient for deleted files. However, I will run into the same problem if the file was renamed at some point before it was deleted.

Thanks in advance.

Sunil Shahi

1 Answer

This is fundamentally pretty difficult; the tools Git provides for this are not really adequate. While git log --follow works fairly well for one file at a time, it does have the problem (as you noted) that it starts with the current name and works backwards. (It is also a hack: see the questions "`git log --follow --graph` skips commits" and "git combining two files into one with history preserved".)

What you can do is use git log --reverse along with any of the options that make it run git diff between each parent/child pair. For instance, using git log --raw (which runs git diff-tree with rename detection enabled, although I'm not sure whether that is only because I enable it by default; add -M if necessary) on the Git repository for Git, I can do this:

$ git log --raw --since 02-20-2010 --until 02-28-2010 --reverse --oneline
2d3ca2167 t7006-pager: if stdout is not a terminal, make a new one
:100755 100755 4f52ea573... da0f96262... M      t/t7006-pager.sh
:000000 100755 000000000... 73ff80937... A      t/t7006/test-terminal.perl
9892bebaf sha1_file: don't malloc the whole compressed result when writing out objects
:100644 100644 657825e14... 9196b5783... M      sha1_file.c
ea68b0ce9 hash-object: don't use mmap() for small files
:100644 100644 657825e14... 037515960... M      sha1_file.c
e95a4df46 Merge branch 'mv/request-pull-modernize'
7fa2b1f60 Merge branch 'jn/makefile-script-lib'
92de34894 Merge branch 'jc/maint-fix-test-perm'
25666af37 Merge branch 'jc/checkout-detached'
5f8a0de98 Merge branch 'sp/push-sideband'
db3df36a3 Merge branch 'hm/maint-imap-send-crlf'
cab1b013e Merge branch 'tc/maint-transport-ls-remote-with-void'
241b9254e Merge branch 'ml/maint-grep-doc'
1caaf225f git-diff: add a test for git diff --quiet -w
:100755 100755 60dd2014d... 0391a5827... M      t/t4017-diff-retval.sh
748af44c6 sha1_file: be paranoid when creating loose objects
:100644 100644 9196b5783... c0214d794... M      sha1_file.c
8c33b4cf6 tests: Fix race condition in t7006-pager
:100755 100755 da0f96262... d9202d5af... M      t/t7006-pager.sh
81b50f3ce Move 'builtin-*' into a 'builtin/' subdirectory
:100644 100644 afedb54b4... f1025d5c0... M      Makefile
:100644 100644 2705f8d05... 2705f8d05... R100   builtin-add.c   builtin/add.c
:100644 100644 fc43eed36... fc43eed36... R100   builtin-annotate.c      builtin/annotate.c
:100644 100644 3af4ae0c2... 3af4ae0c2... R100   builtin-apply.c builtin/apply.c
:100644 100644 6a887f5a9... 6a887f5a9... R100   builtin-archive.c       builtin/archive.c
:100644 100644 5b226399e... 5b226399e... R100   builtin-bisect--helper.c        builtin/bisect--helper.c
:100644 100644 10f7eacf6... 10f7eacf6... R100   builtin-blame.c builtin/blame.c
:100644 100644 a28a13986... a28a13986... R100   builtin-branch.c        builtin/branch.c
:100644 100644 2006cc5cd... 2006cc5cd... R100   builtin-bundle.c        builtin/bundle.c
[massive snipping from here onward]

In late Feb 2010, in commit 81b50f3ce40bfdd66e5d967bf82be001039a9a98, Linus Torvalds moved all the builtin-* source files into builtin/*. The above git log limits the commits shown to those in the last week or so of that month. Using --reverse together with --raw and rename detection, we find that the file we might have remembered as builtin-add.c became builtin/add.c: the similarity detector finds that it's exactly the same (R100, 100% similar) but the name has changed.
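To make the shape of those --raw records concrete, here is a minimal Python sketch that splits one record into its status letter, similarity score, and path(s). The names RawEntry and parse_raw_line are hypothetical, made up for this illustration. It assumes the real output separates the status from the paths with tab characters (the listing above shows them as spaces) and that the paths are plain enough that Git does not quote them.

```python
from typing import NamedTuple, Optional


class RawEntry(NamedTuple):
    """One ':'-prefixed record from `git log --raw` (hypothetical helper type)."""
    status: str               # single letter such as A, D, M or R
    score: Optional[int]      # similarity percentage for renames/copies, else None
    old_path: str             # the only path for A/D/M, the source path for R
    new_path: Optional[str]   # destination path for renames/copies, else None


def parse_raw_line(line: str) -> Optional[RawEntry]:
    """Parse one raw diff record; return None for any other kind of line."""
    if not line.startswith(":"):
        return None  # commit header, message text, or blank line
    meta, *paths = line.rstrip("\n").split("\t")
    if not paths:
        return None  # no tab-separated path, so not a record we understand
    # The last space-separated field before the first tab is the status,
    # e.g. "M" or "R100".
    status_field = meta.split(" ")[-1]
    status = status_field[0]
    score = int(status_field[1:]) if len(status_field) > 1 else None
    new_path = paths[1] if len(paths) > 1 else None
    return RawEntry(status, score, paths[0], new_path)
```

Fed the builtin-add.c record above (with real tabs), this yields status R, score 100, old path builtin-add.c and new path builtin/add.c.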

Note that you cannot use:

$ git log --follow --reverse builtin-add.c
fatal: ambiguous argument 'builtin-add.c': unknown revision or path not in the working tree.

so if you are not sure what the new name is, you must allow Git to view all file names in all commits and search its diff-generated rename-records for a commit that renames the name you are sure about.

The --raw output (which used to be obtained with git-whatchanged, for those of us who were using Git seven or eight years ago) makes it relatively easy to search for '^:.*R.*file-name-you-care-aboutTAB', where TAB stands for a literal tab character; from there, you get the new name of the file. If the file is renamed multiple times, you must repeat the exercise with the new name, to find the newer new name.
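The repeat-until-done search can be automated. The following Python sketch is one possible way to do it, not a Git feature: it reads the oldest-first --raw log once and follows the name forward through rename (R), delete (D) and add (A) records. The function names read_raw_log and trace_path are hypothetical. It assumes tab-separated, unquoted paths, and it only sees what git log diffs by default, so a rename recorded only in a merge commit would be missed.

```python
import subprocess
from typing import Tuple


def read_raw_log(repo_dir: str = ".") -> str:
    """Return `git log --raw -M --reverse` output, oldest commit first."""
    result = subprocess.run(
        ["git", "log", "--raw", "-M", "--reverse", "--format=%h %s"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


def trace_path(log_text: str, start_path: str) -> Tuple[str, bool]:
    """Follow start_path forward in time; return (latest_name, is_deleted)."""
    current = start_path
    deleted = False
    changed_in_commit = False  # apply at most one transition per commit
    for line in log_text.splitlines():
        if not line.startswith(":"):
            # Commit header or blank line: the next records belong to a new group.
            changed_in_commit = False
            continue
        meta, *paths = line.split("\t")
        if not paths or changed_in_commit:
            continue
        status = meta.split(" ")[-1][:1]  # "R100" -> "R", "D" -> "D"
        if status == "R" and len(paths) == 2 and paths[0] == current:
            current = paths[1]            # the file lives on under a new name
            deleted = False
            changed_in_commit = True
        elif status == "D" and paths[0] == current:
            deleted = True                # removed without a detected rename
            changed_in_commit = True
        elif status == "A" and paths[0] == current:
            deleted = False               # created, or re-created after a delete
            changed_in_commit = True
    return current, deleted
```

Run inside the question's test repository as trace_path(read_raw_log(), "file1.txt"), this should report fileone.txt as not deleted, provided rename detection pairs the two names as it did in the --follow output; "file2.txt" should come back under its own name and flagged as deleted. A file that was renamed and later deleted comes back under its last name with the deleted flag set.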

torek
  • Thank you for that meticulous and insightful answer. Although I am realizing that it's a little more work than I expected, it is a fun read. I am assuming the regex in your last paragraph is for grep, which means I will need to modify it to fit another tool available on Windows. I think Select-String should do. – Sunil Shahi Oct 09 '17 at 05:30
  • If you have a Windows Git, you have a shell and grep, because some Git scripts are written in shell script and use commands like sed and grep. But other searchers might be faster or more convenient, yes. – torek Oct 09 '17 at 05:34