How to (recursively) find/identify git repositories with unpushed commits

Question

I use tons of separate git repositories to organize my work (>30) in a well-organized folder structure with a single root. Since I work on >= 2 computers I always must make sure that all changes are pushed before I leave my office.

To identify all repositories with files which have been:

changed/added/deleted/etc. (i.e., the files are already under version control) or
not tracked yet (i.e., new files which git doesn't know yet what to do with)

... I use the following convenient one-liner (without it I could not do my work anymore):

find . -name '.git' | while read repo ; do repo=${repo//.git/}; git -C "$repo" status -s | grep -q -v "^$" && echo -e "\n\033[1m${repo}\033[m" && git -C "$repo" status -s || true; done

(written by hoijui, posted here). Of course I define an alias for it (which I call gitstatus) that I can execute before I leave my office.

This awesome script sadly has one disadvantage: it can't identify repositories in which I have committed changes that I forgot to push. The reason is that the script checks for files with a certain status, but in this case there are no such files. Instead I need to search for an output message of the form:

"Your branch is ahead of 'origin/master' by 1 commit." or
"Your branch is ahead of 'origin/master' by 2 commits."
(Or values >2. I just added both messages to emphasize that these messages have slightly different strings for n=1 and n>1 commits.)

I assume that this is a trivial task for anybody who has written a few bash scripts in the past. I, however, would struggle to even read the script above without investing more than an hour... Also, I was surprised not to find any similar script on Stack Overflow -- even the above-mentioned (incredibly useful) script had not been posted here before! (As far as I can tell.) Thus, although I'm technically asking for help here, I thought that even the question itself (due to the script above) might already be helpful to many! :) And the improved script (if anybody provides an answer) would hopefully be even more helpful.

Anyway, thank you!

Just FYI: The dual (complement) to this question is how to find/identify repositories in which I have remote (rather than local) changes. You find it here.

You'll want a bash loop similar to the one you already have, but in which you run `git fetch` to update the local Git repository's idea of what's on the default remote (or `git remote update` to update *all* remotes, similar to `git fetch --all`, but now you have a more complicated problem because there are seven remotes of which three are up to date and four aren't, for instance). Then you'll need to figure out who's "right" and who's "wrong": just because the local is ahead of or behind the remote doesn't mean the local or remote is more-correct. — torek, Oct 28 '22 at 04:15
Once you've written the business logic, plus the usual boilerplate, you have your answer. You don't really need two separate questions here: what you have is the classic distributed database update problem. — torek, Oct 28 '22 at 04:16
This doesn't seem correct to me. Whether there's an unpushed commit is completely independent of the remote repository. Why should I fetch? A simple git status already tells this information (see example messages above), the sole question is how to make this info accessible to script. I also don't see how there could be any disagreement to resolve (quote: "who is right"). We have all information we need: If something wasn't pushed, we see this -- and that can't be wrong. (Maybe you accidentally answered the *other* question (which is also linked)?) — Prof.Chaos, Oct 28 '22 at 05:23
It's completely dependent on the remote repository, because the set of commits that are pushed vs unpushed depends on the set of commits in the remote repository vs the set of commits in the local repository. Consider the triangular setup where I write a commit in repo A, push it from there to repo B, and repo B pushes it to repo C. Repo A has no idea that the commit I wrote is now on repo C, but it *is*. — torek, Oct 28 '22 at 06:50
The [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem) tells us that there's a consistency issue, because Git deliberately chooses both availability and partition tolerance. In the case of an inconsistency, branch tip names will name partially-disjoint histories; some kind of merging action will be required. — torek, Oct 28 '22 at 06:57
This argument doesn't seem valid. My local repository can simply store the information which of its commits were already pushed or not -- I clearly do not need remote repository information about this. Say I had committed 3 changes. Then my local repository *knows* that they were not pushed, simply because that information is stored. And once they were pushed, this information can also be stored! If it's true what you say, then "git status" (which provided the exact information I want to read) will contact the remote repository. And I always thought it does not! Are you really saying it does? — Prof.Chaos, Oct 29 '22 at 03:25
I think I've just realized where you went wrong. I think you'd like to solve a way more complicated problem that I never asked for, namely some sort of fancy comparison of the local repository and the remote one. I however never asked for this. Please read my question again. I only want to find out whether I pushed all my commits. This is not a combinatorial problem, it's trivial. As said (see my question), git status answers that. But I want this 'collected' for all repositories in all subfolders, just like the other command does that I've listed above. — Prof.Chaos, Oct 29 '22 at 03:56
I'm not going to do your work for you. I described one simple approach to solve a simple problem, and the more complex approach needed for the more general problem. It's now *your* job, not mine, to take the information I've given you, and use it. — torek, Oct 29 '22 at 04:08
You don't have to! :) Nobody has; everything everybody does is voluntary, clearly. I was just correcting a misinterpretation. Pretty sure that's the right thing to do! (And I don't think I've received any new information that is relevant for solving the problem I had described.) As said, this problem is not a combinatorial problem of any sort. It's basically a "grepping problem" in combination with fetching the output of some other program (git status in this case). — Prof.Chaos, Oct 29 '22 at 04:15

Prof.Chaos · Accepted Answer · 2022-10-31T00:40:57.407

Found the solution! It was on Stack Overflow after all... (Don't know why I didn't find it initially.)

Anyway, here is a function that one can simply add to the .bashrc:

function gitStatus()
{
echo "Repositories with unpushed commits:"
find . -type d -iname '.git' -exec sh -c 'cd "${0}/../" && git status | grep -q "is ahead of" && pwd' "{}" \;
echo ""
echo "Repositories with changed/deleted/added/untracked files:"
find . -name '.git' | while read repo ; do repo=${repo//.git/}; git -C "$repo" status -s | grep -q -v "^$" && echo -e "\n\033[1m${repo}\033[m" && git -C "$repo" status -s || true; done
}

Note that:

the first "find script" is the actual answer.
the second "find script" is the one I already posted in the question itself, which lists all changed/added/deleted/untracked files.
there's a minor inconsistency between the two scripts: the first reports absolute paths, the second relative ones. (But that hardly matters, so I didn't bother making this consistent.)
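
For anyone who wants consistent output, here is a minimal sketch of a variant (an assumption-laden illustration, not the script from the linked thread). Instead of grepping the human-readable `git status` text, it asks `git rev-list --count '@{upstream}..HEAD'` how many commits the current branch has that its upstream remote-tracking ref lacks, and it prints paths relative to the search root like the second script does. The function name `listUnpushed` is hypothetical. Branches without a configured upstream are silently skipped, and, like `git status`, this only compares against the locally stored remote-tracking ref; it does not contact the remote.

```shell
# Hypothetical helper: print every repository below the given directory
# (default: current directory) whose current branch is ahead of its upstream.
listUnpushed() {
    find "${1:-.}" -type d -name '.git' | while IFS= read -r gitdir; do
        repo="${gitdir%/.git}"   # strip only the trailing "/.git"
        # Count commits reachable from HEAD but not from the upstream ref.
        # The command fails (and the repo is skipped) if the current branch
        # has no upstream or HEAD is detached.
        count=$(git -C "$repo" rev-list --count '@{upstream}..HEAD' 2>/dev/null) || continue
        if [ "$count" -gt 0 ]; then
            printf '%s (%s unpushed)\n' "$repo" "$count"
        fi
    done
}
```

Calling `listUnpushed` with no argument searches the current directory; `listUnpushed ~/work` (an example path) searches another root.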

The solution was provided by 'Qetesh' in Stack Overflow's post on Git - How to find all "unpushed" commits for all projects in a directory? Note that there are other solutions in that thread, but this one seems the most elegant to me.
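
One limitation of checking `git status` is that it only looks at the currently checked-out branch. As a rough sketch of how all local branches could be covered (this is not claimed to be one of the solutions from that thread, and the function name `listAheadBranches` is hypothetical), `git for-each-ref` can print the `%(upstream:track)` field for every branch under `refs/heads`:

```shell
# Hypothetical helper: for every repository below the given directory
# (default: current directory), list each local branch that is ahead of
# its upstream.
listAheadBranches() {
    find "${1:-.}" -type d -name '.git' | while IFS= read -r gitdir; do
        repo="${gitdir%/.git}"   # strip only the trailing "/.git"
        # %(upstream:track) expands to e.g. "[ahead 2]" or
        # "[ahead 1, behind 3]"; it is empty when the branch is in sync
        # with its upstream or has none.
        git -C "$repo" for-each-ref \
            --format='%(refname:short) %(upstream:track)' refs/heads |
        while IFS= read -r line; do
            case $line in
                *'[ahead'*) printf '%s: %s\n' "$repo" "$line" ;;
            esac
        done
    done
}
```

As with `git status`, the comparison uses the locally stored remote-tracking refs, so the result reflects the state as of the last fetch or push.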
