This question is either fairly hard, or ridiculously easy, depending on just what you mean:
git diff --name-only --diff-filter=D <hash-of-B> <hash-or-branch> -- <dir>
Remember that Git stores commits, rather than files. Each commit contains files, but each commit is otherwise an independent snapshot of all files—well, all files that are in that snapshot, but that's kind of a redundant and useless way to put it. Let's just consider a tiny repository with three commits that we'll call A
, B
, and C
so that we don't have to deal with big ugly hash IDs:
A <-B <-C <--master
The branch name master
holds the hash ID of commit C
. In our case we can just look at all the commits and see that C
is obviously last, but in a real repository, with thousands of random-looking hash IDs, that's too hard, so we need something (here, the branch name) to hold the hash ID of the last commit.
Commit C
has an author name-and-email and time-stamp, a committer name-and-email-and-time-stamp, a log message, and so on. It also holds the hash ID of commit B
, so that we can go from C
to B
. And, C
holds all the files you want git checkout
to put into your work-tree when you git checkout master
.
Meanwhile B
has author and committer and log message and so on, and holds the hash ID of its previous commit A
. For its snapshot, B
holds all the files you want git checkout
to put into your work-tree when you git checkout <hash-of-B>
.
Commit A
has the usual author/committer/log metadata. It records that there is no earlier commit, so that, for instance, git log
knows to stop there. And for its snapshot, A
holds all the files you want git checkout
to put into your work-tree when you git checkout <hash-of-A>
.
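To see that chain for yourself, here is a minimal sketch that builds the three commits in a throwaway repository and then follows the links backwards. The scratch directory, the file contents, the `commit` helper and the Example identity are all made up for illustration; only the Git commands themselves are real.

```shell
#!/bin/sh
# Sketch: build A <- B <- C in a scratch repository and walk the chain.
set -eu
repo=$(mktemp -d)                        # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master  # make sure the branch is named master

# Hypothetical helper: commit everything under a fixed, made-up identity.
commit() {
    git add -A
    git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

echo one   > file; commit A
echo two   > file; commit B
echo three > file; commit C

tip=$(git rev-parse master)         # the hash ID the name master holds: commit C
parent=$(git rev-parse master^)     # the hash ID stored inside C: commit B
# For the root commit, --parents prints only the commit's own hash: no parent.
root_line=$(git rev-list --parents -n 1 master~2)
# git log starts at the tip and follows the stored parent IDs until it runs out.
subjects=$(git log --format=%s master | tr '\n' ' ')

echo "master holds:    $tip"
echo "C's parent is:   $parent"
echo "root commit line: $root_line"
echo "log order:       $subjects"
```

The name `master` only ever holds one hash ID; everything else is found by following the parent IDs stored in the commits.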
So: suppose you have picked out some historical commit, such as B
, from a slightly bigger repository, with two branches master
and develop
and seven commits we'll call A
through G
, arranged like this:
        D--E   <-- master
       /
A--B--C
       \
        F--G   <-- develop
You want to know what's different between B
and ... well, this is where it gets interesting. What does
"beyond a certain commit"
actually mean? From B
, we can go to C
, if we work in the opposite direction of Git's own internal arrows. But from C
, we can go to either D
or F
, and from there to E
(if we went to D
) or G
(if we went to F
). You need to pick a direction.
Having picked a direction—"in the direction of the tip of develop
", for instance—is it OK to just compare commit B
directly to commit G
? Both are complete snapshots. Suppose B
has files TODO
, d1/f1
and d2/f2
(a total of 3 files), and G
has files d1/f1
, d2/f2
, d2/f3
, and d3/f4
(4 files). You can then run:
git diff --name-status <hash-of-B> <anything-that-finds-commit-G>
and Git will tell you that to change commit B
to match commit G
, you'd have to add (A
) files named d2/f3
and d3/f4
. It might also tell you that you'd have to modify (M
) d1/f1
and it would definitely tell you that you have to delete (D
) file TODO
.
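Here is a small sketch of that comparison, assuming a scratch repository in which only the two snapshots matter. As a simplification, G is committed directly on top of B, since git diff compares the two snapshots and ignores whatever lies between them. File contents, the `commit` helper and the identity are made up.

```shell
#!/bin/sh
# Sketch: compare snapshot B with snapshot G using --name-status.
set -eu
repo=$(mktemp -d)                        # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit everything under a fixed, made-up identity.
commit() {
    git add -A
    git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

# Snapshot B: TODO, d1/f1, d2/f2 (3 files).
mkdir d1 d2
echo "things to do"  > TODO
echo "first version" > d1/f1
echo "second file"   > d2/f2
commit B
B=$(git rev-parse HEAD)

# Snapshot G: d1/f1 (changed), d2/f2, d2/f3, d3/f4 (4 files).
git rm -q TODO
mkdir d3
echo "changed version" > d1/f1
echo "third file"      > d2/f3
echo "fourth file"     > d3/f4
commit G
G=$(git rev-parse HEAD)

# One line per differing file: a status letter, a tab, then the path.
status=$(git diff --name-status "$B" "$G")
printf '%s\n' "$status"
```

With these contents the output should list TODO as D, d1/f1 as M, and d2/f3 and d3/f4 as A. The unchanged d2/f2 is not mentioned at all.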
Combine --name-status
with --diff-filter
to make Git print only the files that have the statuses you want. For instance, if you want to know which files to delete and which ones to add, use --diff-filter=AD
. Git won't mention the files that need to be M
odified, only those that need to be A
dded or D
eleted.
Replace --name-status
with --name-only
to keep the same output as before, minus the status letter. Now you'll see TODO
, d2/f3
, and d3/f4
, without knowing that TODO
is the one to delete. Change --diff-filter
to select only D
files, and you'll see only TODO
: with just one status selected, the dropped status letter no longer matters.
Now all you need to do is limit the output to just those files whose name starts with d2/
. To do that, tell git diff
to list only such files, by adding the pathspec d2
(you can write it as d2/
or just d2
: if there are files named d2/f1
and d2/f2
there is no file just named d2
: your OS can't handle a file and a directory with the same name, so Git won't store that).
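Putting the filter, the name-only output and the pathspec together might look like the sketch below. It reuses the B and G snapshots from the example, plus one hypothetical extra file, d2/old, that exists in B but not in G, so that the d2 pathspec has something to report. The helper and the identity are made up as well.

```shell
#!/bin/sh
# Sketch: narrow the B-versus-G diff with --diff-filter, --name-only, and a pathspec.
set -eu
repo=$(mktemp -d)                        # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit everything under a fixed, made-up identity.
commit() {
    git add -A
    git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

mkdir d1 d2
echo "things to do"   > TODO
echo "first version"  > d1/f1
echo "second file"    > d2/f2
echo "obsolete notes" > d2/old           # hypothetical file, not in the example above
commit B
B=$(git rev-parse HEAD)

git rm -q TODO d2/old
mkdir d3
echo "changed version" > d1/f1
echo "third file"      > d2/f3
echo "fourth file"     > d3/f4
commit G
G=$(git rev-parse HEAD)

# Added and deleted files only, with status letters; the modified d1/f1 is left out.
git diff --name-status --diff-filter=AD "$B" "$G"

# Deleted files only, names only, anywhere in the tree.
deleted=$(git diff --name-only --diff-filter=D "$B" "$G")

# Deleted files only, limited to paths under d2.
deleted_d2=$(git diff --name-only --diff-filter=D "$B" "$G" -- d2)

printf 'deleted anywhere:\n%s\n' "$deleted"
printf 'deleted under d2:\n%s\n' "$deleted_d2"
```

The last command has the same shape as the one at the top of this answer: two commits, a D-only filter, and a directory after the `--`.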
But what if, after commit B
—say, in C
or D
or E
—someone added some file, and then removed that file again in commit G
? The above git diff
won't tell you that. If you want to know that, your job is harder. You're going to have to look at every commit along the path from B
to G
.
What if "beyond a certain commit" means down every path, from B
to G
but also from B
to E
? Then you'll have to look at all of those commits.
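One way to look at every commit, rather than just the two end points, is to let git log run the same filter commit by commit. The sketch below is only an illustration under stated assumptions: it builds the seven-commit graph in a scratch repository, adds a hypothetical d2/tmp in F and removes it again in G, and has D delete d2/f2 on master. The file names, the helper and the identity are all made up.

```shell
#!/bin/sh
# Sketch: find deletions under d2 in every commit after B, not just B versus G.
set -eu
repo=$(mktemp -d)                        # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit everything under a fixed, made-up identity.
commit() {
    git add -A
    git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

echo base > README;          commit A
mkdir d2; echo two > d2/f2;  commit B
B=$(git rev-parse HEAD)
echo more >> README;         commit C

git checkout -q -b develop
echo scratch > d2/tmp;       commit F    # F adds d2/tmp ...
git rm -q d2/tmp;            commit G    # ... and G removes it again

git checkout -q master
git rm -q d2/f2;             commit D    # hypothetical deletion on the master side
echo done >> README;         commit E

# Direct snapshot comparison: d2/tmp is in neither B nor G, so nothing is listed.
direct=$(git diff --name-only --diff-filter=D "$B" develop -- d2)

# Commit by commit along the path from B to the tip of develop.
along_path=$(git log --diff-filter=D --name-only --format= "$B"..develop -- d2 |
    sed '/^$/d')

# Commit by commit down both paths: reachable from master or develop, but not from B.
all_paths=$(git log --diff-filter=D --name-only --format= master develop --not "$B" -- d2 |
    sed '/^$/d' | sort -u)

printf 'direct diff:   [%s]\n' "$direct"
printf 'along B..G:    [%s]\n' "$along_path"
printf 'all paths:\n%s\n' "$all_paths"
```

Note that git log does not show diffs for merge commits by default, so a history with merges would need more care than this linear sketch gives it.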
You must answer these questions for yourself, then choose how to diff.