To list all the files present in a sub-directory at a particular commit, I can check out that commit and look at the files. Is there a way to list all the files that have been present in a sub-directory at any point (i.e. in any commit among the current commit's ancestors)?

dumbledad

2 Answers

Yes: git ls-tree accepts both the so-called "tree-ish" and an optional path (which defaults to repository root, if not specified).

A tree-ish is anything which can be resolved to a tree—an object which stores the information about a single "directory" in a recorded commit.

A commit always refers to exactly a single tree object, and a tree object may refer to zero or more other tree objects—representing "subdirectories".

Hence, a commit is always a valid "tree-ish".

TL;DR

To list the files under the foo/bar prefix as they were two commits back (HEAD~2), call

git ls-tree HEAD~2 foo/bar
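
As a minimal sketch of the same call, the variant below adds -r to recurse into subdirectories and --name-only to print just the paths. The function name files_at and its two arguments are made up for illustration; the path is taken relative to the current directory.

```shell
# Hypothetical helper: list the files recorded under a directory in one commit.
# $1 = tree-ish (commit, tag, branch, HEAD~2, ...)
# $2 = directory, relative to the current directory
files_at() {
  # -r recurses into sub-trees; --name-only drops mode, type and object id
  git ls-tree -r --name-only "$1" "$2"
}
```

For example, files_at HEAD~2 foo/bar would print one path per line for that commit.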

Update, following the clarification of the original question:

But I want the files in all commits, i.e. HEAD, HEAD~, HEAD~2, HEAD~3, HEAD~4, … right back through the entire history. Perhaps my question is not clear?

One could roll with a bit of shell scripting:

git rev-list HEAD | while read sha; do
  git ls-tree -r "$sha" prefix/of/interest;
done

…which basically walks all the commits in the DAG reachable from HEAD, and for each of them calls git ls-tree with the name of the commit and the prefix (directory name) of interest.

Note that if a tree object for the prefix does not exist in a particular commit, git ls-tree exits with a non-zero exit code, so if this code is to be run under set -e, this should be acknowledged and compensated for.
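
The loop above prints a path once for every commit that contains it. A possible refinement, sketched here under the assumption that only the de-duplicated set of paths is wanted, pipes the result through sort -u and guards each git ls-tree call so that the loop survives under set -e. The function name files_ever_under is hypothetical.

```shell
# Hypothetical helper: every path that existed under a directory in any
# commit reachable from HEAD, one per line, without duplicates.
# $1 = directory, relative to the current directory
files_ever_under() {
  git rev-list HEAD | while read -r sha; do
    # "|| true" keeps the loop going under `set -e` should ls-tree fail
    git ls-tree -r --name-only "$sha" "$1" || true
  done | sort -u
}
```

This runs git ls-tree once per commit, so it may be slow on repositories with a long history.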

This approach might be upgraded in a number of interesting ways.
For instance, instead of just getting the list of files, one might call something like git show --name-status $sha for each commit, and so on.
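
One way this could look, as a rough sketch only: restrict git rev-list to the commits that touch the directory and show the status letters (A, M, D, ...) for each of them. The function name changes_under and the 'commit %h' header format are choices made for this illustration.

```shell
# Hypothetical helper: for each commit that touches a directory, print a short
# header followed by the status and name of the files changed under it.
# $1 = directory, relative to the current directory
changes_under() {
  # "-- path" limits rev-list to commits that modify something under the path
  git rev-list HEAD -- "$1" | while read -r sha; do
    git show --format='commit %h' --name-status "$sha" -- "$1"
  done
}
```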

kostix
  • What if a file was deleted in `HEAD~2`? Will `git ls-tree` show that this file existed at some point in the history of the branch? – Code-Apprentice Jul 24 '20 at 15:45
  • But I want the files in all commits, i.e. HEAD, HEAD~, HEAD~2, HEAD~3, HEAD~4, … right back through the entire history. Perhaps my question is not clear? – dumbledad Jul 24 '20 at 16:40
  • @dumbledad, yes, to me it was not clear: I interpret "at any point" as "at any point of my choosing", not as "at all the points they existed at", sorry. – kostix Jul 24 '20 at 17:04
  • @Code-Apprentice, no, it will not. To get the list of commits a given pathname existed in, one would use `git rev-list` (plumbing level command) or `git log -- that/path/name`. – kostix Jul 24 '20 at 17:05
  • @Code-Apprentice, updated my answer with a solution. – kostix Jul 24 '20 at 17:22

Ionică Bizău's answer to another question ("How to get ONLY filename with path using git log?") gave me something to work with. This seems to list all the files that ever existed under the foo/bar directory in any of the commits leading to the current state:

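git log --name-status -- foo/bar | grep -E '^[A-Z]\b' | sed -e 's/^\w\t*\ *//' | sort | uniq
  • git log --name-status gives the names and status of the files changed in each commit.
  • -- foo/bar restricts the log to the directory of interest. The path is relative to the current directory; an absolute path such as /foo/bar normally lies outside the repository and makes git fail.
  • | is the shell pipe, which feeds the output of one command into the next.
  • grep -E '^[A-Z]\b' keeps the lines that start with (^) a capital letter ([A-Z]) followed by a word boundary (\b), i.e. the status lines starting M, D, A, .... Rename lines such as R100 have a similarity score right after the letter, so they do not match and are dropped.
  • sed -e 's/^\w\t*\ *//' runs the stream editor with a substitution of the form s/regexp/replacement/. Here ^\w\t*\ * matches, at the start of the line (^), a single word character (\w) followed by zero or more tabs (\t*) and then zero or more spaces (\ *), and replaces the match with the empty string, leaving only the path.
  • sort orders the lines so that identical paths end up next to each other.
  • uniq collapses adjacent identical lines into one.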

(N.B. This solution will not include files that have been added to the working tree or the stage but not yet committed, so add and commit first.)
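
A shorter pipeline that should give a similar result without the grep and sed steps is sketched below. It asks git log for names only with an empty commit header, and uses --no-renames so that a rename is reported as a delete plus an add and both names appear. The function name touched_under is hypothetical, and the sketch assumes every file of interest was added or changed in a non-merge commit, since git log does not show diffs for merges by default.

```shell
# Hypothetical helper: every path under a directory that was touched by any
# commit reachable from HEAD, one per line, without duplicates.
# $1 = directory, relative to the current directory
touched_under() {
  # --pretty=format: prints an empty header, leaving only file names and blank lines
  # grep . drops the blank lines; sort -u sorts and removes duplicates
  git log --pretty=format: --name-only --no-renames -- "$1" | grep . | sort -u
}
```

For example, touched_under foo/bar lists the paths, including files that have since been deleted.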

dumbledad