
Is there a command to see all changed files and the branch where they were changed?

I know about git diff --name-only <branch>, but I need a command that checks for changes on every branch.

Any suggestions?

To make it clearer:

It's for a tool that should compare your local files with the files on the server. If the code in a file differs from the code in that file's "server version", the file should be written to the output. In addition, I want the tool to write the branch in which the file was edited to the output.

Output:

Uncommitted changes in [FILENAME] in Branch [BRANCHNAME]

Trks
  • What do you mean "all changed files"? Changed when compared to what exactly? In the most extreme interpretation you'd get a list of the cross product of every branch/tag/commit and which files are different between them. If you told us *why* you need that (i.e. what you want to achieve with that) then we'd probably understand your goal better. – Joachim Sauer Oct 06 '22 at 11:30
  • I need a comparison between my uncommitted changes and the branch. – Trks Oct 06 '22 at 12:06
  • that comment makes things more confusing. Please may you clarify what you're trying to do and _why_? – evolutionxbox Oct 06 '22 at 12:07
  • Okay, I need a command that checks all files of a certain repo and returns the ones that have changes (comparison between the local file and the file on the server) as output. I need the command to show in the console which files were changed and on which branch the changes happened. It's for a tool that checks multiple repos for uncommitted changes. – Trks Oct 06 '22 at 12:12
  • I'm sorry, maybe I'm being dense, but things just get more and more confusing. You've got a.) the current working copy (not checked in), b.) the currently checked out branch, c.) all the other branches and d.) *entire repos* that are unrelated to this one. If BRANCH_A and BRANCH_B differ in FileX and your local working copy has a new FileY, what exactly do you want the output to be? Maybe give your own example of what you want output to be. [Edit] your question to include that instead of explaining in the comments, please. – Joachim Sauer Oct 06 '22 at 12:18
  • I edited my question; I hope it's clearer now. – Trks Oct 06 '22 at 12:33
  • What do you mean by "the branch in which the file was edited"? Are you wanting to know about diffs between your current working directory and a particular branch on the remote and wanting to know about which branch was merged into the remote branch that committed the last change to a particular file? Or is "the branch in which the file was edited" referring to a local branch? Or do you just want to know what was the name of the branch that was used to make the last change to the file? Branches are pretty ephemeral, so none of these really make much sense. – William Pursell Oct 06 '22 at 12:39
  • "Or do you just want to know what was the name of the branch that was used to make the last change to the file" exactly – Trks Oct 06 '22 at 12:42
  • @Tom.T branches move all the time, you may not be able to find out that information. You can see what _commit_ a file was changed in – evolutionxbox Oct 06 '22 at 13:39
  • "Tool that should Compare your local Files With The Files on the Server" is homework, and written by someone colossally ignorant of Git. What server? Why there? Git compares files available in history or a work tree. – jthill Oct 06 '22 at 16:19

1 Answer


Is there a Command to see all changed files and the Branch where they were changed?

No. In fact, there are no changes in Git: Git stores commits, and each commit holds a full snapshot of all files, not changes. Moreover, branches—in the sense you're thinking of—aren't real. They simply don't exist, not the way you're thinking anyway. To make any sense of this—indeed, to make any sense of Git at all—one must stop thinking in terms of files and changes, and think instead in terms of snapshots, i.e., commits.

[Desired output would include lines of the form] Uncommitted changes in [FILENAME] in Branch [BRANCHNAME]

Uncommitted work is not in Git. As it's not in Git in the first place, it's not in any branch, regardless of how you define branch. So this is literally impossible.

How to think about Git

To work with some existing software that has been stored into Git, you begin by cloning a repository. These two terms need to be defined, and the first one to define is "repository".

A Git repository is, at its heart, really two databases. One is usually much bigger. It contains Git's commits and other internal Git objects. All of these objects are numbered, with unique but random-looking hash IDs which are simply large numbers expressed in hexadecimal, such as dda7228a83e2e9ff584bf6adbf55910565b41e14. The number for a commit is unique, across all Git repositories, even those that do not contain that commit, which is highly magical and actually mathematically impossible.[1] Still, it works fine in practice.

A side effect of this magical numbering scheme is that once a commit is made, no part of it can ever be changed, not even by Git itself. The ID of a commit is exquisitely sensitive to the value and position of every bit of the commit, so that if you take a commit out of the database, change just one bit, and put the result back, what you get is a new, different commit with a different number: a different unique hash ID. The old commit still exists; you have simply added on to the set of commits in the database.
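To see this for yourself, here is a minimal sketch that "changes" a commit and shows that the result is a second commit with a different ID, while the first is still there. The throwaway repository, file name, and commit messages are made up for the illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
old_id=$(git rev-parse HEAD)

# "Editing" the commit message really builds a new commit with a new hash ID.
git commit -q --amend -m "Add notes (reworded)"
new_id=$(git rev-parse HEAD)

echo "before amend: $old_id"
echo "after amend:  $new_id"

# The original commit is still in the object database, untouched.
old_type=$(git cat-file -t "$old_id")
old_subject=$(git log -1 --format=%s "$old_id")
echo "old object is still a $old_type with subject: $old_subject"
```

The two printed IDs differ, and the old ID still names a commit with its original message; nothing was modified in place.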

Now, these big ugly hash IDs are just that: big and ugly. Humans are particularly bad at them. We just can't remember them! So to avoid forcing us humans to memorize hash IDs, Git has a second—usually much smaller—database in the repository. This smaller database maps names, such as branch and tag names, to specific hash IDs. Each name can only store one (1) hash ID, but Git is cleverly built so that this is sufficient.

These two databases, plus a few ancillary items, are all that is required of a Git repository. Some repositories consist of only that much, but there's a catch here: since nothing in the big database can ever change, you can't get any work done in this kind of repository. (In the smaller database, with names and hash IDs, everything is changeable, and things change a lot. But the big database is basically append-only: you can only add new stuff.)


[1] This means that someday, Git will fail. The sheer size of the numbers puts that day so far into the future that we can hope to all be long dead and gone before it occurs. Certain malicious acts could hasten this, though, and Git needs to move to a new even-larger hashing scheme; progress on this is underway now. See also "How does the newly found SHA-1 collision affect Git?"


Commits

Before we move on to how to get work done, though, let's talk a bit more about commits. Commits are the reason Git exists, and commits are stored in the big database. Each commit is numbered, as we saw. Because commits are in this append-only database, they are read-only. And, each commit is little more than a way to store a full snapshot of every file.

That "little more" is important though. That extra, above and beyond the full snapshot, is crucial for making Git work. Every commit stores two things, not just one:

  • Each commit stores, indirectly (so that commits can share stuff), all the files that go with that particular commit. That's your snapshot. Once made, that snapshot is as permanent as the commit itself (which is technically only mostly permanent although it's really hard to get rid of a commit and there's no "remove commit" operation), and fully read-only.

  • Meanwhile, each commit stores, directly, metadata, or information about the commit itself. This includes such information as the author (name and email address) of the commit.

In that same metadata, each commit stores a list of previous commit numbers. Most commits have exactly one entry in this list; Git calls such a commit an ordinary commit. A few commits have two (or more) entries and Git calls those merge commits, and any non-empty repository has one "very first" commit which by definition can't have any previous commit and therefore doesn't (Git calls this a root commit although you don't normally need to care about that).

This list-of-previous commits, with ordinary commits just listing one previous ("parent") commit, is how Git stores history. The history in the repository is nothing more or less than the commits in the repository. And that's almost the whole story right there: the commits in the repository—in the big all-objects database—are the history and store all the files, and that's mostly what a Git repository is about: storing files, forever, or as long as the commits exist anyway.
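The snapshot-plus-metadata layout can be inspected directly with git cat-file -p. The following is a small sketch in a scratch repository (the file name and messages are invented): the second commit's raw text shows a tree line for the snapshot, a parent line, and the author and committer lines, while the root commit has no parent line at all.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo two > file.txt
git commit -q -am "second"

# Raw commit object: tree (snapshot), parent, author, committer, then message.
git cat-file -p HEAD

# The parent recorded in the second commit is the first commit's hash ID.
parent_of_second=$(git rev-parse HEAD^)

# The root commit's raw text contains no "parent" line.
root_parent_lines=$(git cat-file -p "$first" | grep -c '^parent ' || true)
echo "parent lines in root commit: $root_parent_lines"
```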

The repository stores the commits and the commits store files, and that's almost all there is to it. The catch—or two catches, really—is that if you want to use Git this way:

  • you have to memorize some hash IDs, and
  • you have no way to get any new work done.

So the rest of Git exists to fix these two catches.

Branch names help you and Git find commits

I am not going to go into a lot of detail here (for said detail, see many of my other Git answers), but each branch name in Git simply stores the hash ID of the latest commit that we want to consider "part of that branch". This lets Git quickly and easily find that particular commit.
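As a rough illustration, git rev-parse turns a branch name into the hash ID it currently stores, and making a new commit moves the name. This sketch uses a scratch repository whose initial branch is forced to be called main; all names in it are just for the demo.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # call the initial branch "main"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo a > file.txt
git add file.txt
git commit -q -m "first"
tip_before=$(git rev-parse main)

echo b >> file.txt
git commit -q -am "second"
tip_after=$(git rev-parse main)

# The name "main" now stores a different hash ID: the new latest commit.
echo "main was: $tip_before"
echo "main is:  $tip_after"

# Every branch name with the one hash ID it holds.
git for-each-ref --format='%(refname:short) -> %(objectname)' refs/heads

parent_of_tip=$(git rev-parse main^)
```

The old tip is not lost: it is the parent of the new tip, which is how Git walks backwards from the name.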

Since commits themselves remember previous commits, if we can find the latest commit, Git can use that latest commit's metadata to find the next-to-latest commit. The second-latest commit's metadata lets Git find the third-to-latest commit, and so on. Other branch names can also point to earlier commits, and this means that many commits are on many branches. In a typical Git repository that has exactly one root commit, that root commit is on every branch. This is very different from most version control systems: in most VCSes, branches are something "real", in that you make a commit on some branch, and from then on, that commit is on that one branch and only that one branch; but in Git, branches are fluid, constantly changing which set of commits they "reach", and each commit is often on many branches. You can also have commits that are on no branch.
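The "one commit, many branches" point can be checked with git branch --contains. A minimal sketch, with an invented second branch name (feature) in a scratch repository:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # call the initial branch "main"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo a > file.txt
git add file.txt
git commit -q -m "root"
root=$(git rev-parse HEAD)

git branch feature                         # a second name for the same commit

echo b >> file.txt
git commit -q -am "second, made while on main"
tip=$(git rev-parse HEAD)

# Which branch names can reach each commit?
branches_with_root=$(git branch --contains "$root" --format='%(refname:short)' | sort | tr '\n' ' ')
branches_with_tip=$(git branch --contains "$tip" --format='%(refname:short)' | sort | tr '\n' ' ')

echo "root commit is on: $branches_with_root"
echo "tip commit is on:  $branches_with_tip"
```

The root commit is reported on both names, even though it was made before feature existed, so "the branch a commit was made on" is not something Git records.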

For a file to be in a repository, that file must, in general—there are some specific exceptions but they're not used this way in practice—be stored in a commit. That commit is on however many branches it's on, possibly none. The file can't be changed, in the same way that the commit can't be changed. So the files inside the repository are all read-only, just as the commits are all read-only.

Getting work done: picking a branch name and commit

Once branch names solve the hash-IDs-are-impossible-to-remember problem, the remaining problem is that with the files in commits being read-only, it's impossible for us to get any new work done. To fix this, a normal repository—as opposed to the bare ones found on servers—comes with a working tree or work-tree.

While this work-tree comes with a normal repository, it's not actually part of the repository. Until Git 2.5, each repository could only have one working tree. The new-in-Git-2.5 git worktree command added the ability to add more working trees. It's probably best, though, to start with the notion of "the" working tree of a Git repository, as a normal repository has one particular distinguished working tree that cannot be removed (vs the extra, added ones that can be removed at any time).

The (or a) working tree is normally "on" some branch, or else is in "detached HEAD" mode and thereby on no branch. You select this branch's name with git switch or the older git checkout command:

git switch foo

means switch to the existing branch foo or, if there is no existing branch named foo, try to create one by guessing from a matching remote-tracking name such as origin/foo. However, you can always, at any time, create a new branch name without changing anything else:

git switch -c newbr

creates a new branch name, newbr, that selects the current commit without changing anything else.

The current commit is in a key way much more important than the current branch name. The branch name is merely a way to find the commit's hash ID. Creating a new branch, or renaming the current branch, changes the name, but does not change the hash ID.[2]
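A quick way to convince yourself is to record the current hash ID, create and rename a branch with uncommitted work in place, and compare. This is only a sketch; the repository and the name renamed are made up, and it assumes Git 2.23 or later for git switch.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "base"

echo "work in progress" >> file.txt        # uncommitted edit in the working tree

before=$(git rev-parse HEAD)
git switch -q -c newbr                     # new name, same commit
after=$(git rev-parse HEAD)
current=$(git branch --show-current)
still_modified=$(git diff --name-only)     # the edit is still there

git branch -m newbr renamed                # rename the current branch
after_rename=$(git rev-parse HEAD)

echo "commit before: $before"
echo "commit after:  $after (now on $current)"
echo "after rename:  $after_rename"
```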

If you use git switch to switch from branch br1 to branch br2, both names already exist, and both already select some commit. If both names select the same commit, the switch always works, and changes nothing—but if the two names select different commits, then the working tree magic springs into action.


[2] You can use git switch or git checkout to create a new branch at some commit other than the current commit. If this action succeeds (it can fail), it does change the current hash ID as well as the current branch name. The commit you select here must already exist, though, and it must be possible to switch to that commit. The effect is as if you said "switch to that commit in detached-HEAD mode, then create a new branch name using that now-current commit".


Getting work done: checking out a commit

When you move the current working tree from one existing commit to another, Git effectively removes, from your working tree, all the files that went with the old existing commit, and then re-fills your working tree with all the files that go with the new existing commit. This is the process of checking out a commit, and in the old days we used git checkout to do that. In Git since 2.23, you're encouraged to use git switch instead. This doesn't really do anything different for this particular case. The reason you're encouraged to use the new command is that the old one has traps in it for the unwary (ones that even sometimes bite the wary, old-hand Git users).

Git will, when you're doing this file-changing, normally check to make sure that you haven't modified some files first. That is, suppose you checked out commit a123456... and it contained a file named README.md. Git will have created this file in your working tree, and it will contain the same contents as the README.md file stored permanently inside the commit.

If you've then edited this file in your working tree, and try to switch to a different commit with a different README.md content, Git would have to remove your working tree file and replace it with the README.md content from the other commit. This would destroy your uncommitted work. So Git says "sorry, I can't do that right now".

Git does take some short-cuts here. Git can tell (because of the way files are stored in commits) which files are different in one snapshot vs another. So if you're changing from commit a123456... to commit b789abc..., Git knows which files exist in both commits as identical content, and doesn't bother ripping those out and putting them back in. So if you're switching from a123456... to b789abc... but the README.md in these two commits is the same, Git can leave your modifications alone as you switch.
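Both behaviors can be reproduced in a scratch repository. In this sketch (file and branch names are invented, and Git 2.23 or later is assumed), README.md differs between the two commits while other.txt is identical in both, so an edit to the first blocks the switch and an edit to the second is carried along.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # call the initial branch "main"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

printf 'v1\n' > README.md
printf 'same\n' > other.txt
git add README.md other.txt
git commit -q -m "base"

git switch -q -c br2
printf 'v2\n' > README.md
git commit -q -am "change README.md on br2"
git switch -q main

# Case 1: edit a file whose committed content differs between main and br2.
printf 'local edit\n' >> README.md
if git switch br2 2>/dev/null; then refused=no; else refused=yes; fi
echo "switch with edited README.md refused: $refused"
git restore README.md                      # discard the edit for the next case

# Case 2: edit a file that is identical in both commits.
printf 'local edit\n' >> other.txt
git switch -q br2
carried=$(git diff --name-only)
now_on=$(git branch --show-current)
echo "now on $now_on, still modified: $carried"
```

In the second case the same uncommitted edit is now "on" br2, which is why tying uncommitted work to a branch name is not meaningful.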

This, in fact, is what leads to the ability to create a new branch at any time. If you create a new branch without changing commits, you aren't making Git swap out any files either. So making a new branch at the current commit never has to swap out any files, and therefore it's never stopped by the file-switching "whoops you'd lose your work" problem.

There is a lot more to it than this: I haven't touched on Git's index (aka the staging area) at all, for instance. But this is the key to the problem with your question as asked: any files you have that have uncommitted work merely exist in your working tree. Your working tree may be on a branch at the moment, but that branch name is not actually relevant. You can rename the branch or create a new branch. The commit hash ID might matter, or might not, because the changes could be carried over to another branch if the underlying committed file is the same in the old and new commit hash IDs.

The bottom line

If you want to see uncommitted work, just run git diff or git diff --cached or git diff HEAD (these three all do something different and the details depend on that index / staging-area that I didn't cover above). This uncommitted work is not in Git. If you overwrite it, Git cannot help you get it back.
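For a rough feel of how the three differ: git diff compares the working tree with the index, git diff --cached compares the index with HEAD, and git diff HEAD compares the working tree with HEAD. The sketch below stages a change to one file and leaves a change to another unstaged; the file names are arbitrary.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "base"

printf 'staged change\n' >> a.txt
git add a.txt                              # a.txt: changed and staged
printf 'unstaged change\n' >> b.txt        # b.txt: changed, not staged

unstaged=$(git diff --name-only)                        # working tree vs index
staged=$(git diff --cached --name-only)                 # index vs HEAD
all=$(git diff --name-only HEAD | tr '\n' ' ')          # working tree vs HEAD

echo "git diff:          $unstaged"
echo "git diff --cached: $staged"
echo "git diff HEAD:     $all"
```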

If you want to know what branch you're on right now, use git status or git branch or similar to show that information. The branch name is not actually important though (except perhaps to you of course): what matters is the commit hash ID (except, perhaps, to you, since hash IDs are impossible for humans to work with). There's only one commit hash ID, and either one branch name (normal mode) or no branch name (detached HEAD mode) for your working tree.
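Putting those two pieces together gives something close to the output format in the question, with the caveat above: the name printed is only the branch the working tree happens to be attached to right now, not "where the file was edited". The function report_uncommitted is a hypothetical helper written for this sketch, not a Git command. It assumes the path given is the top level of a repository with at least one commit, and Git 2.22 or later for git branch --show-current.

```shell
set -e

# Hypothetical helper: one line per file that differs from HEAD or is untracked.
report_uncommitted() {
    repo=$1
    branch=$(git -C "$repo" branch --show-current)
    [ -n "$branch" ] || branch="(detached HEAD)"
    {
        git -C "$repo" diff --name-only HEAD                 # staged or unstaged edits
        git -C "$repo" ls-files --others --exclude-standard  # untracked, not ignored
    } | sort -u | while IFS= read -r file; do
        echo "Uncommitted changes in $file in Branch $branch"
    done
}

# Demo in a throwaway repository.
demo=$(mktemp -d)
git -C "$demo" init -q
git -C "$demo" symbolic-ref HEAD refs/heads/main   # call the initial branch "main"
git -C "$demo" config user.name "Example User"     # placeholder identity
git -C "$demo" config user.email "user@example.com"

echo one > "$demo/tracked.txt"
git -C "$demo" add tracked.txt
git -C "$demo" commit -q -m "base"

echo two >> "$demo/tracked.txt"                    # modify a tracked file
echo new > "$demo/untracked.txt"                   # add an untracked file

report=$(report_uncommitted "$demo")
echo "$report"
```

To check several repositories, the same function could be called in a loop over their paths.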

(If you have added working trees, each one comes with its own HEAD, index, and working-tree files.)
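A short sketch of that, assuming Git 2.22 or later and using invented directory and branch names: after git worktree add, each working tree reports its own current branch, and an uncommitted edit in one does not show up in the other.

```shell
set -e
base=$(mktemp -d)
git init -q "$base/main-tree"
cd "$base/main-tree"
git symbolic-ref HEAD refs/heads/main      # call the initial branch "main"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "base"

# Add a second working tree, attached to a new branch named "side".
git worktree add -b side "$base/extra-tree" >/dev/null 2>&1

main_branch=$(git -C "$base/main-tree" branch --show-current)
extra_branch=$(git -C "$base/extra-tree" branch --show-current)

# An uncommitted edit lives in one working tree only.
echo "wip" >> "$base/extra-tree/file.txt"
main_dirty=$(git -C "$base/main-tree" status --porcelain)
extra_dirty=$(git -C "$base/extra-tree" status --porcelain)

echo "main-tree  is on $main_branch, status: [$main_dirty]"
echo "extra-tree is on $extra_branch, status: [$extra_dirty]"
git worktree list
```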

torek