If a file is tracked by git, and it has changed since its last commit, then I can get this:

$ git show --name-status --oneline myfile.txt
fe12828 (HEAD -> master, origin/master, origin/HEAD) Testing
M       myfile.txt

This M would be the "status letter" for a file (https://git-scm.com/docs/git-show says "See the description of the --diff-filter option on what the status letters mean.")

On the other hand, if the file hasn't changed since last commit, the above command returns nothing:

$ git show --name-status --oneline myfile.txt
$

In this case, I can still retrieve the date of last commit with:

$ git log -1 --format=%cI myfile.txt
2021-04-14T19:06:19+02:00

So my question is - is there a single git command (hopefully just by using string format specifiers), that given an input tracked file, would return date of last commit and status letter; say "2021-04-14T19:06:19+02:00 M" if the file has changed since last commit, or "2021-04-14T19:06:19+02:00" if it hasn't?

Short of that, is it possible to get just M myfile.txt from git show (or even just M) for a file, without the commit hash and message that are currently shown (and without having to parse the output with another tool)?

sdbbs

2 Answers

The closest I got is this (note: using --format="%h %cI" instead of --pretty=format:"%h %cI" adds an extra empty line as a separator):

$ git show --name-status --pretty=format:"%h %cI" myfile.txt
fe12828 2021-04-15T22:54:24+02:00
M       myfile.txt

... and to get just the line with the status letter, set --pretty=format: to the empty string:

$ git show --name-status --pretty=format:"" myfile.txt
M       myfile.txt

EDIT: note that git show does not show the current status of the file (versus the index), it shows the status in the last commit (as git log -1 would)! To get the status versus the index, use git status --short myfile.txt
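
If what you actually want is the date of the last commit plus the current status of the working copy, one possibility is to glue git log and git status together in a small shell function. This is only a sketch: the name last_commit_and_status is made up here, it assumes a single tracked file is passed, and it prints the raw two-column codes of git status --porcelain with spaces removed (so "M", "MM", "A", and so on).

```shell
# Hypothetical helper: prints "<committer date> <status>" for one path,
# or just the date when the file is unchanged.
last_commit_and_status() {
    path=$1
    # Committer date (strict ISO 8601) of the last commit that touched the path.
    date=$(git log -1 --format=%cI -- "$path")
    # The first two columns of porcelain output are the index/worktree codes.
    status=$(git status --porcelain -- "$path" | cut -c1-2 | tr -d ' ')
    # Append the status only when it is non-empty.
    printf '%s\n' "${date}${status:+ $status}"
}
```

Calling last_commit_and_status myfile.txt should then print the date alone for an unchanged file, and the date followed by a status code otherwise.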

sdbbs

The things to know here are:

  • Every commit has every file. To get something like M<TAB>myfile.txt, Git has to be comparing two commits. (More precisely, every commit has every file that it has, which is why, by comparing the parent commit to this commit, we'll sometimes see a file as being added or deleted: the previous commit lacks the file and this one has it, or vice versa.)

  • The git show command does just that: compare two commits. Actually, it compares two or more commits, which we'll get back to below. But the child commit that git show compares is the one you name on the command line, or HEAD if you don't name one. So you'll compare HEAD~1 vs HEAD here.

  • The git log command, on the other hand, is much more complex. It will walk the revision history starting from some starting point(s). As before, the default starting point is HEAD. Its job is to list some, all, or none of the commits walked.

  • Revision history (or more simply, "history") is nothing more or less than commits. Each commit is numbered, with a unique, random-looking (though not random at all), hash ID, normally expressed as a hexadecimal number. Each commit contains, as one of its two parts, some metadata (the other part is the snapshot of all files). That metadata contains a list of predecessor or parent hash IDs. Most commits, called ordinary commits, have just one parent. This is what gives git log and git show the ability to compare the two snapshots (parent and child) to see if any particular file changed.

  • An ordinary commit is one that (as just mentioned) has just one parent. A merge commit is any commit with two or more parents. There is a third kind of commit, a root commit, that has no parent; for our purposes here, a root commit terminates a walk, but its comparison to its (nonexistent) parent shows all files added. (Git achieves this simply, internally, by comparing it to the empty tree.)
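
To see which of the three kinds a given commit is, you can count its parents. The following is a minimal sketch, assuming a POSIX shell; commit_kind is a name invented for this illustration, and it relies on git rev-list --parents printing the commit hash followed by its parent hashes on one line.

```shell
# Hypothetical helper: classify a commit by its number of parents.
commit_kind() {
    # Output is "<commit> <parent>..."; split it into positional parameters.
    set -- $(git rev-list --parents -n 1 "${1:-HEAD}")
    # Nothing printed means the revision could not be resolved.
    [ $# -gt 0 ] || return 1
    case $(( $# - 1 )) in
        0) echo root ;;
        1) echo ordinary ;;
        *) echo merge ;;
    esac
}
```

For example, commit_kind HEAD reports the kind of the current commit.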

The particular method by which git log walks the commit graph is to use a priority queue. The arguments you supply, if any, to git log as starting points are turned into commit hash IDs and entered into the queue. If you supply no starting point, Git turns HEAD into a hash ID and puts that single hash ID in the queue.

The graph-walk proceeds with:

  • an option to choose which parents, if any, go into the queue; and
  • an option to choose which commits, if any, actually get shown.

The -1 or -n 1 option, if given, tells git log to quit the walking process after showing one commit.

Pathnames, if given, affect two things: one for both git show and git log, one for git log only:

  • the diffs performed, and
  • the commits shown, for git log.

Normally, a diff between two commits compares every file in each snapshot. Those that are identical are not mentioned. Those that are different are mentioned in some form. The form of mention depends on additional arguments such as --name-status, --name-only, --raw, and the like.

The git log command normally does not run git diff, but when using pathnames, does run git diff; normally, even then, it does not show the result. To make git log show the result, add -p to its options. Whether or not it is showing diffs, git log will, when given pathnames, reduce the diff to just the paths of interest. This is the same behavior that git show and git diff exhibit when they are given pathnames (so at least we have a great deal of consistency here).

When git log has been given pathnames and is doing this sort of internal diff of reduced snapshots—trees, in Git terminology—a commit is selected for display when it has a difference from its parent(s). However, there is a snag here when talking about merge commits. Since merge commits have multiple parent commits, it's difficult to know which parent, if any, to use for a diff.

The solution git log normally takes—when not doing diffs—is to not bother to do diffs, and just walk to all parents. The git show command does not have this luxury, nor does git log when it is being forced to check diffs.

What git show does by default is produce a combined diff. Combined diffs are described in the documentation. I plan to mostly ignore them here. What git log does is more complicated:

  • By default, for showing the diff, git log says nothing at all: you must force a diff with -c or --cc, or a split with -m (there is work in progress to improve this going on today).
  • For walking to one or more parents, git log applies what it calls history simplification. This is something you control, with argument flags to git log. The default action is to pick any one of the parent commits in which the file(s) in question are identical, as long as at least one such parent exists. The remaining parents are then ignored.

So when using git log myfile.txt, you have:

  1. turned on history simplification, so that git log walks only one parent (unless, after diff-filtering, all parents differ; then it walks all parents); and
  2. told git log to print only those commits in which there is a difference from parent to child, for the single file selected.

Adding -n 1 or -1 makes it stop after listing out that one commit.

The presence of merges, if any, means that the walk and the priority queue are important, though if history simplification does find a single "TREESAME" commit (as the git log documentation puts it), the relative priority of multiple commits in the queue becomes irrelevant: the queue only ever has one commit in it at a time.
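
One way to observe history simplification is to count how many commits get listed for a path with and without --full-history. This is a rough sketch only: count_listed_commits is a hypothetical name, and the two counts differ only in repositories whose merges actually trigger the simplification; in a linear history they are the same.

```shell
# Hypothetical helper: count the commits git would list for a path.
# Extra arguments (for example --full-history) are passed to git rev-list.
count_listed_commits() {
    path=$1
    shift
    git rev-list --count "$@" HEAD -- "$path"
}
```

Compare count_listed_commits myfile.txt with count_listed_commits myfile.txt --full-history.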

If you add -p (and, if the diff ultimately found here is in a merge, --cc) to your git log command, you'll get the same kind of M, A, D, R, etc., results from a diff of this commit vs its parent(s) as you would for git show.
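
If you only want the status letters rather than a full patch, --name-status (the option from the question) works with git log as well. A minimal sketch, with last_change being a made-up name; note that for a merge commit git log prints no diff unless you add options such as -c, --cc or -m, so this covers the ordinary-commit case only.

```shell
# Hypothetical helper: committer date, a blank line, then "<letter><TAB><path>"
# for the most recent commit that changed the given path.
last_change() {
    git log -1 --format=%cI --name-status -- "$1"
}
```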

Now, on to the answer to your specific question:

So my question is - is there a single git command (hopefully just by using string format specifiers), that given an input tracked file, would return date of last commit and status letter; say "2021-04-14T19:06:19+02:00 M" if the file has changed since last commit, or "2021-04-14T19:06:19+02:00" if it hasn't?

No: it takes at least two commands and/or some scripting. But note that this question itself seems a bit ill-specified: what, precisely, do you mean by "last commit" here? The word tracked is also nonsense here: a tracked file, in Git, is one that is in Git's index, but these commands do not examine Git's index. They look only at existing commits, pairwise (parent vs child, for ordinary commits) or as a combined diff (merge commits, when all parents are diffed against the single child).

Having located a commit—which you can do with either git log here, or git rev-list—you can get its hash ID. Given the hash ID of the child where the file differs in some way from the copy/ies in its parent/parents, you can then examine the time stamps in the child commit and/or the difference in that file / those files. In general I would suggest using git rev-list here, as this so-called plumbing command is designed to be used in scripts, and hence reliably produces the same kind of output regardless of any user's configuration items:

hash=$(git rev-list -n 1 HEAD -- myfile.txt)

The result is the empty string if myfile.txt does not exist in the current commit or any previous commit, or is the root commit if the contents of myfile.txt are identical in all commits all the way back to the root commit. Assuming that we know that myfile.txt does exist in HEAD, $hash will be non-empty.

Note: we had to add HEAD explicitly, because git rev-list, unlike git log, won't do that for us. I left out --cc and any other options you might or might not want since you were not using them originally.
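
In a script it is worth guarding against the empty-string case before going on. A small sketch, where last_touching_commit is a hypothetical wrapper name and the error message is just an illustration:

```shell
# Hypothetical wrapper: print the hash of the last commit reachable from
# HEAD that touches the path, or fail if there is no such commit.
last_touching_commit() {
    hash=$(git rev-list -n 1 HEAD -- "$1")
    if [ -z "$hash" ]; then
        echo "no commit reachable from HEAD touches $1" >&2
        return 1
    fi
    printf '%s\n' "$hash"
}
```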

We can now find the parent(s) of this commit:

set -- $(git rev-parse $hash^@)

for instance (see gitrevisions and git rev-parse) or simply run git diff-tree on it. The diff-tree command is another plumbing command, so is likewise insensitive to user's configurations (this prevents the script from breaking if the user has configured git diff to act weirdly). The default action of git diff-tree is to compare the commit with all of its parents, in combined-diff style. It does produce the hash ID of the commit, unless suppressed with --no-commit-id, so we would normally do that too. For instance:

$ git diff-tree HEAD --no-commit-id
:040000 040000 48bad06a0a34a320f48b7f42972cb20236da0e64 affee3f8c35fae08c59d87c66e97704930b99d8b M      Documentation

As this shows, we often want -r so that git diff-tree will look inside sub-trees:

$ git diff-tree HEAD --no-commit-id -r
:100644 100644 f39eede0011738e00c7707746bc12f1ac88fa2ed d69e69ffd765afb9239056939b493895dec43c85 M      Documentation/RelNotes/2.32.0.txt

The default output has this so-called raw format, which is useful when looking at merge commits:

$ git diff-tree HEAD^ --no-commit-id -r -c
::100644 100644 100644 5cd8578b6f387b5a9e75d57a3b61006d28b4e686 741c9f8b2b881d7517af10750ef0bb9eb20f1758 911da181a108d274fba46b30ead171ee0e22d89d MM    Documentation/git-format-patch.txt
::100644 100644 100644 980de590638374c1563845a576b7e8bf54dabfcc af853f11146442983e404e17ac313aa13f0f7446 8acd285dafd874c8ee995e028413f15ff77a316b MM    builtin/log.c
::100644 100644 100644 a20a530d52ab05b3b651a01eb91f53706f1bb72b 097d08354c6151b479de7a2e92561dcacb10c418 a24f72dcd151a3146d971832db9ee703809845b9 MM    revision.h

(this particular merge has empty output when using the --cc style combined diff, which is why I used the -c style instead). The double colons indicate that there are two parents; the double Ms are the status against each parent.

Add --diff-filter=M and -- myfile.txt to limit the git diff-tree output to only modified files and only the file named myfile.txt, and add --name-only or --name-status to get the format you prefer:

$ git diff-tree HEAD^ --no-commit-id -r -c --diff-filter=M --name-status -- builtin/log.c
MM      builtin/log.c

The output here will be empty if the file in question is added or deleted. Since we know that the file will be output with --cc (assuming we used --cc with our git rev-list) we don't need -r and --cc after all: we can, I think, just use --name-only and test for non-empty output:

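path=myfile.txt
# Most recent commit reachable from HEAD that touches $path
hash=$(git rev-list -n 1 HEAD -- "$path")
# Non-empty only when that commit modified the file (status M)
diff=$(git diff-tree --no-commit-id --diff-filter=M --name-only "$hash" -- "$path")
if [ -z "$diff" ]; then
    echo "file was added or deleted"
else
    echo "file was modified"
fi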

If the commit found is a merge, and the diff against one parent has, e.g., status A and the other has status M, I'm not sure what happens here: you should test this case.

(Note: we can force Git to skip over the selection of merges with --no-merges, if you want to eliminate this particular complication. That's part of the revision walk, not part of the diff, so it's a rev-list / log option. The revision walking code still walks through the merges, it just never chooses one as "to be output" and therefore never stops after printing one.)

(Note: git diff-tree, when presented the hash ID of a root commit, defaults to printing nothing. Adding --root makes it show every file as added. Since we're going to use --diff-filter=M, which would suppress all of these, there's no need for --root either. However, if you wish to start looking at status letters, consider using --root as well.)

In all cases, you will want to use git log with a --format directive to get the author and/or committer dates from the commit whose hash ID is in $hash. You can, if you wish, combine this with the git rev-list step by using git log directly, but the savings here are probably minimal at best.
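
Putting the pieces together, the "two commands plus some scripting" could look roughly like the sketch below. The function name last_commit_date_and_letter is invented for this illustration. It uses --name-status with --root and -r instead of --diff-filter=M, so that added files show up as A, and it does nothing special for merge commits, where the letter column may hold one letter per parent or be empty, so test that case before relying on it.

```shell
# Hypothetical helper: prints "<committer date> <status letter(s)>" for the
# last commit reachable from HEAD that touches the given path.
last_commit_date_and_letter() {
    path=$1
    hash=$(git rev-list -n 1 HEAD -- "$path")
    # No commit touches the path: nothing to report.
    [ -n "$hash" ] || return 1
    # Committer date of that commit, strict ISO 8601.
    date=$(git log -1 --format=%cI "$hash")
    # -r looks inside sub-trees; --root shows a root commit's files as added.
    letter=$(git diff-tree --no-commit-id --root -r --name-status "$hash" -- "$path" | cut -f1)
    printf '%s\n' "${date}${letter:+ $letter}"
}
```

Running last_commit_date_and_letter myfile.txt then gives something in the shape the question asks for, for the committed history rather than for the working copy.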

torek