15

Git does not track directories as such. It only tracks files that live in some directory. (See How can I add an empty directory to a Git repository?)

However, if I have a history of commits, I implicitly also have a history of changes to the directory tree.

So how do I answer questions like:

  1. When was directory foo/bar created (in git terminology: when was the first file created in that directory)? There could be more than one qualifying commit if foo/bar was deleted at some point in history and recreated later.
  2. When was directory foo/bar removed (in git terminology: when was the last file removed from that directory)? As above, there could be more than one qualifying commit.
  3. Which subdirectories of foo/bar existed at any point in history?

The closest I could come up with is in pseudo code:

loop over all commits (git rev-list --all)
  start from repo root directory
  do recursively on the directory tree rebuilt so far
    call git ls-tree and grep the tree lines
    rebuild next level of directory tree
  end
end

Obviously this could be written in your favorite scripting language.

Then I have all directory trees, and I still need to search them in a smart way in order to be able to answer questions of type 1 - 3. Again, doable, but probably not in a couple of minutes.

The questions are:

  1. Is there an easier way?
  2. If not: are suitable scripts already available on the net? (My googling didn't reveal any, but I did not come up with the perfect search words either.)
Uwe Geuder

2 Answers

11

For questions 1 and 2, it's quite easy:

  • when was directory foo/bar created?
    git log --oneline -- foo/bar | tail -n 1

  • when was directory foo/bar deleted?
    git log --oneline -- foo/bar | head -n 1
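These two commands report only the oldest and the newest commit that touched foo/bar. If the directory was removed and recreated several times, something along the following lines might report every such transition. This is only a sketch: dir_exists and dir_events are hypothetical helper names (not git commands), history is walked from HEAD, and each commit is compared with its first parent only, so merges may need extra care.

```shell
#!/bin/sh
# Sketch: report every commit in which a directory appears or disappears.
# dir_exists and dir_events are hypothetical helper names, not git commands.

# Succeeds if path $2 exists as a tree (directory) in commit $1.
dir_exists() {
    [ "$(git cat-file -t "$1:$2" 2>/dev/null)" = "tree" ]
}

# Prints "created <hash>" or "removed <hash>" for directory $1, oldest first.
# Assumption: each commit is compared with its first parent only.
dir_events() {
    dir=$1
    # Only commits that touched something below $dir can change its existence.
    git rev-list --reverse HEAD -- "$dir" |
    while read -r rev; do
        if dir_exists "$rev" "$dir"; then now=1; else now=0; fi
        # For a root commit "$rev^" does not resolve, so before stays 0.
        if dir_exists "$rev^" "$dir"; then before=1; else before=0; fi
        if [ "$now" = 1 ] && [ "$before" = 0 ]; then echo "created $rev"; fi
        if [ "$now" = 0 ] && [ "$before" = 1 ]; then echo "removed $rev"; fi
    done
}
```

Run inside the repository, `dir_events foo/bar` would print one line per transition, which covers questions 1 and 2 even when there is more than one qualifying commit.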

However, the third part is a bit tricky and I cannot answer it completely.

The two commands above give you $FIRST_REV (created) and $LAST_REV (deleted).
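Since --oneline prints an abbreviated hash followed by the subject line, the raw output is awkward to reuse as a revision. One possible way to capture just the full hashes is sketched below; first_rev and last_rev are hypothetical helper names, and the same single-creation, single-deletion assumption applies.

```shell
#!/bin/sh
# Sketch: print only the full hash of the oldest / newest commit touching a path.
# first_rev and last_rev are hypothetical helper names, not git commands.

# Oldest commit reachable from HEAD that touched path $1.
first_rev() {
    git log --format=%H -- "$1" | tail -n 1
}

# Newest commit reachable from HEAD that touched path $1.
last_rev() {
    git log --format=%H -- "$1" | head -n 1
}
```

With these, `FIRST_REV=$(first_rev foo/bar)` and `LAST_REV=$(last_rev foo/bar)` set the two variables.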

The following snippet lists the directories (tree entries) present in each commit between those two revisions:

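# Note: A..B excludes A itself, so the creating commit is not visited.
for rev in $(git rev-list "$FIRST_REV..$LAST_REV")
do
  # -r recurses into subtrees, -d keeps only tree (directory) entries
  git ls-tree -r -d --name-only "$rev"
done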

Now you have a list of the directories that were present, but it still contains duplicates. Pipe the list through sort -u and you're done:

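#!/bin/sh
# FIRST_REV and LAST_REV must be set to the two hashes found above.
for r in $(git rev-list "$FIRST_REV..$LAST_REV")
do
    # -r recurses into subtrees, -d keeps only tree (directory) entries
    git ls-tree -r -d --name-only "$r"
done | sort -u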

However, you lose the information about which commits affected these directories. That's the drawback.

Also, this assumes that foo/bar was created only once and is no longer present.
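If the commit information should be kept, one possible variation records, for each subdirectory, the oldest visited commit that contains it. This is a sketch rather than a tested tool: subdirs_ever is a hypothetical helper name, history is walked from HEAD only, and "oldest" follows the order of git rev-list --reverse, which is only approximate in a heavily branched history.

```shell
#!/bin/sh
# Sketch: list every subdirectory that ever existed below a directory,
# together with the first visited commit (oldest first) that contains it.
# subdirs_ever is a hypothetical helper name, not a git command.
subdirs_ever() {
    dir=$1
    # Visit only commits that touched something below $dir, oldest first.
    git rev-list --reverse HEAD -- "$dir" |
    while read -r rev; do
        # "$rev:$dir" names the tree of $dir in that commit; -r -d lists
        # only directories, recursively, with paths relative to $dir.
        # Commits in which $dir does not exist produce no output.
        git ls-tree -r -d --name-only "$rev:$dir" 2>/dev/null |
        while IFS= read -r sub; do
            printf '%s\t%s\n' "$rev" "$sub"
        done
    done | awk -F '\t' '!seen[$2]++'   # keep the first line per subdirectory
}
```

`subdirs_ever foo/bar` prints tab-separated pairs of commit hash and subdirectory path, and it does not depend on foo/bar having been created only once.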

eckes
  • +1 - you were significantly earlier than me to suggest just using `git log`. If it's OK, I'll leave my answer there for a little while, just to see what the OP's response is to my suggestion for 3. – Mark Longair Aug 17 '11 at 13:50
  • Did you intend to pass the * to the shell or to git? I don't think either will work. For the shell it works only if the content that was added first/removed last is in your working area when you run the command. git log doesn't seem to like the *. Either way, I don't think it works. – Uwe Geuder Aug 17 '11 at 13:57
  • @Uwe: changed answer. Tried it with a test repo and the code above does also work if `foo/bar` is already deleted. – eckes Aug 17 '11 at 14:05
  • @Eckes: Very interesting indeed! Did not know/expect that git log would accept directory names. I'll still wait a while whether any alternative suggestions come up. – Uwe Geuder Aug 17 '11 at 14:18
  • Just commenting on your first 2 commands now. They work only if foo/bar has been created and possibly deleted once in history. I have edited my question to make it clearer that I need more than one commit if applicable. Sorry, not to annoy you, my history here is a lot of back and forth, add and remove... Still need to test your loop in real life, but need to run now... – Uwe Geuder Aug 17 '11 at 14:42
5

gitk foo/bar gives you a user interface for browsing the git history limited to commits that touched foo/bar.

wadim