
I'm still trying to understand some git concepts. My understanding of branches is that each branch can have its own changes that will only be in that branch, then you can push and merge changes into master. I was being sloppy and making changes on my master branch (not committed), so all those changes carried through to new branches where I didn't want the changes. When I try to revert changes to the last pushed master branch, it is reverting these changes in all my branches. Is there a way I can revert everything in my local master branch and selectively revert changes to specific files in branches I already have made?

For example, let's say I have file1 and file2 in my repo, and I'm happy with the latest committed version of the repo. Then I made some changes to file1 and file2 in master but did not commit these changes. Then I decided I wanted to have a branch for each of these file changes so I can work on them individually, so I created new branches file1_update and file2_update from master. Since master had changes, these carried through to file1_update and file2_update. I want to revert file2 in file1_update and file1 in file2_update, and then revert everything in master to the latest version without any changes. Is there a way to do this?

TheStrangeQuark
  • Have you gotten an answer that helps you solve the problem? If so, you can mark it as accepted, which will also help other members who run into similar questions. – Marina Liu Aug 06 '18 at 06:26

3 Answers

Note: before or after reading the text below (I recommend after), you may also want to look at Checkout another branch when there are uncommitted changes on the current branch.

What Git really does is save snapshots. That's almost all there is to it:

$ git init          # create empty repository: no commits exist yet

Then, repeatedly:

... do some work ...
$ git add <files>   # copy the work into the index
$ git commit        # turn everything that is in the index, into a snapshot

Each git commit packages up whatever is in the index (aka staging area aka cache) right now and turns that into a snapshot, which is permanent—well, mostly permanent—and completely read-only.
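As a rough illustration of the point that a commit packages the index rather than the work-tree, here is a minimal sketch. The temporary directory, the file name notes.txt, and the placeholder identity are assumptions made only for this example.

```shell
# Sketch: a commit snapshots the index, not the work-tree.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

echo "staged version" > notes.txt
git add notes.txt                         # copy notes.txt into the index
echo "later edit" >> notes.txt            # change only the work-tree copy
git commit -q -m "First snapshot"         # freeze the index into a commit

committed=$(git show HEAD:notes.txt)      # what the snapshot actually holds
echo "$committed"
```

The later edit is still sitting in the work-tree copy of notes.txt; it would only become part of a snapshot after another git add and git commit.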

We will come back to all of this in a bit.

Commits, hash IDs, and branch names

Except for the very first commit, you always make a new snapshot while sitting on an existing snapshot. The new snapshot gets a commit hash ID: an apparently random string of hexadecimal digits, like b7bd9486b055c3f967a870311e704e3bb0654e4f. This is the true name of the commit: it's how Git finds the commit and, through it, the snapshot. That lets you, some time in the future, find out what you saved now.

Each commit also records the hash ID of the commit that was the existing snapshot at the time. If we use single uppercase letters, which as mere humans we can comprehend, instead of the big ugly hash IDs, we can call that very first snapshot A. The second snapshot is therefore B and saves the actual hash ID of A inside it. We say that B points to A:

A <-B

When we make our third snapshot C, we do that while sitting on B, so C points to B:

A <-B <-C
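The backward-pointing links can be inspected directly. The sketch below builds three empty commits whose messages A, B, and C stand in for the letters above (the real names are still hash IDs), then reads the parent line stored inside each commit. The repository location and identity are placeholders.

```shell
# Sketch: each commit stores the hash ID of its parent.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

git commit -q --allow-empty -m "A"; a=$(git rev-parse HEAD)
git commit -q --allow-empty -m "B"; b=$(git rev-parse HEAD)
git commit -q --allow-empty -m "C"

git cat-file -p HEAD                      # raw contents of C, including its parent line
parent_of_c=$(git cat-file -p HEAD | sed -n 's/^parent //p')
parent_of_b=$(git cat-file -p "$b" | sed -n 's/^parent //p')
parent_of_a=$(git cat-file -p "$a" | sed -n 's/^parent //p')   # empty: A is the root
```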

What we—and Git—need to know, then, is what's the latest snapshot? That's what a branch name is really about: a branch name, like master, records the last snapshot. If the latest is C, we have:

A--B--C   <-- master

If we make a new commit D, the name master now needs to remember D. D will point back to C; master does not need to remember C any more, because D will:

A--B--C--D   <-- master

The arrows within commits always point backwards, from child to parent, and since nothing, not even Git itself, can change anything inside any existing commit, we don't really need to draw them. But branch name arrows do change over time, so we should keep drawing them.

Now, suppose we make a new branch name like dev at this point. The name dev will record some commit ID. It could record any of the four, but the default is to make it using the current commit ID, which is the one master holds, giving us this:

A--B--C--D   <-- dev, master

Now that we have two branch names, we need to know: which branch name are we using? This is where HEAD comes in: we attach the word HEAD to one of these names. That's our current branch, whose commit ID is stored in the branch name, so if we are on dev, the picture is really:

A--B--C--D   <-- dev (HEAD), master
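A small sketch of this state, assuming a throwaway repository with a single commit standing in for D: both names hold the same commit ID, and git symbolic-ref reports which name HEAD is attached to.

```shell
# Sketch: two branch names on one commit, with HEAD attached to dev.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

git commit -q --allow-empty -m "D"
git branch dev                            # new name, same commit as master
git checkout -q dev                       # attach HEAD to dev

current=$(git symbolic-ref --short HEAD)  # the name HEAD is attached to
dev_tip=$(git rev-parse dev)
master_tip=$(git rev-parse master)
echo "on $current: dev=$dev_tip master=$master_tip"
```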

Now if we make a new commit E, E will point back to D, and Git will update the current name (dev) to point to E:

A--B--C--D   <-- master
          \
           E   <-- dev (HEAD)

If we now run git checkout master and make a new commit F, F will point back, not to E, but to D—that's the one master points to—and Git will update master to point to F:

A--B--C--D--F   <-- master (HEAD)
          \
           E   <-- dev

That's it: that's all that a branch name is and does! It just records the latest commit, which Git calls the tip commit. The good stuff is all in the commits: each commit is a complete snapshot of everything that was in the index.
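The last two pictures can be reproduced in a scratch repository. This is only a sketch: the empty commits D, E, and F stand in for real work, and the identity is a placeholder.

```shell
# Sketch: dev and master diverge from the shared commit D.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

git commit -q --allow-empty -m "D"; d=$(git rev-parse HEAD)
git checkout -q -b dev                    # create dev at D and attach HEAD to it
git commit -q --allow-empty -m "E"        # only dev moves
git checkout -q master
git commit -q --allow-empty -m "F"        # only master moves

git log --oneline --graph --all           # draws the fork
dev_parent=$(git rev-parse 'dev^')
master_parent=$(git rev-parse 'master^')
```

Both E and F record D as their parent; the only things that moved were the two branch names.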

The index and the work-tree

All the files that are inside a commit are in a special, Git-only, compressed form (often highly compressed, at least for source text files). Git is pretty much the only program that can read them or do anything with them (see footnote 1). So Git needs a way that you and your computer can read and write to ordinary-format files. Those files go into your work-tree, so-called because here, you can work with them.

Git has, however, an intermediate form for all the files. It takes those compressed, Git-only, read-only files and copies them—well, stuff about them, really—into something Git calls the index. Here, the files are still compressed in a Git-only form, but here, they can be overwritten. It also uses this index to keep track of—to index and cache, hence those names—information about the work-tree files. This is where Git gets most of its speed. There are similar VCSes that don't have an index, proving that it's unnecessary in a theoretical sense, but they are slower (sometimes hugely slower) than Git.

Having provided this index, Git forces you to use the index, even if you don't really want to. Instead of copying files straight from a commit to the work-tree, it copies files from the commit, into the index first, and only then expands them out to normal form in the work-tree. This is why Git makes you run git add every time: what git add does is to copy the file from the work-tree, into the index (compressing it into Git format in the process).
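One way to watch git add copy a file into the index is to compare the blob hash recorded in the index with the hash of the file on disk. A minimal sketch, with file.txt as a made-up file name:

```shell
# Sketch: git add stores a copy in the index; later edits do not touch it.
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "one" > file.txt
git add file.txt                                         # copy into the index
staged=$(git ls-files -s file.txt | awk '{print $2}')    # blob hash held by the index

echo "two" > file.txt                                    # edit the work-tree copy only
on_disk=$(git hash-object file.txt)                      # hash of the current file
still_staged=$(git ls-files -s file.txt | awk '{print $2}')

echo "index: $still_staged  work-tree: $on_disk"
```

The index keeps the version that was added until git add is run again.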

This is how it is that git commit is so fast, compared to other VCSes: Git can just take whatever is in the index right now, package it into a commit, and be done. All the hard work of compressing files is already done! Git does not even have to look at the work-tree.

This also means that after git commit, the new commit you just made matches the index. Likewise, after git checkout branch, the index matches the tip commit of branch, because Git copied the commit to the index while updating the work-tree. After git commit changes branch to have a new tip commit, the index matches the (new) tip commit of branch, because Git copied the index (froze it into a snapshot) to make the commit.
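A quick check of that claim, as a sketch in a scratch repository: git write-tree turns the current index into a tree object, and right after a commit its hash should equal the tree recorded in HEAD.

```shell
# Sketch: right after git commit, the index and the tip commit hold the same tree.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "Snapshot"

index_tree=$(git write-tree)              # tree built from the current index
commit_tree=$(git rev-parse 'HEAD^{tree}')  # tree recorded in the tip commit
git diff --cached --quiet && echo "index matches HEAD"
```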


Footnote 1: Nothing can change them. This is a design feature: the actual contents of everything are stored under a cryptographic checksum hash ID. (This is where the hash IDs actually come from.) The hash ID is exquisitely sensitive to every single bit, so if you were to change something (accidentally, as with a disk error, or on purpose by overwriting it), Git would detect that the object's checksum no longer matches the checksum-key used to retrieve the object. That's why everything, once committed, is read-only.

Commits can be forgotten about, on purpose. Doing so is sometimes tricky, and they will very easily get restored: Git is mainly designed to add things, not to remove them, and is much more willing to add new things than it is to forget old ones. We won't cover this in any detail here.
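The sensitivity of the hash ID to content can be seen with git hash-object, which computes the ID Git would use for some content without storing anything. A small sketch with made-up inputs:

```shell
# Sketch: identical content gives the same hash ID; a one-character change gives another.
h1=$(printf 'hello\n' | git hash-object --stdin)
h2=$(printf 'hello\n' | git hash-object --stdin)   # same bytes as h1
h3=$(printf 'hellp\n' | git hash-object --stdin)   # last letter changed

echo "$h1"
echo "$h2"
echo "$h3"
```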


"But commits look like diffs!"

If you run:

git show <commit>

or:

git log -p

you will see each commit shown as a patch. Git can do this because each commit stores its previous commit—its parent—inside the commit. Git simply extracts both snapshots and compares them. Whatever is different, gets shown.
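The comparison can also be requested explicitly. In this sketch (file name and contents are made up), git diff is given the parent and the commit, which should produce the same kind of patch that git show displays for that commit:

```shell
# Sketch: the "patch" for a commit is the difference between two snapshots.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

printf 'first line\n' > file.txt
git add file.txt
git commit -q -m "A"
printf 'first line\nsecond line\n' > file.txt
git add file.txt
git commit -q -m "B"

patch=$(git diff 'HEAD^' HEAD)            # compare the parent snapshot with this one
echo "$patch"
```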

(There is a complication here at merge commits, but we'll just ignore that, too.)

Revert

What revert does can now be described very simply (see footnote 2): Git turns the commit into a patch, then reverse-applies the patch to some other commit.

That is, if the commit-as-patch says "add a line to file A", Git removes that line from that file. If the commit-as-patch says "remove a line from file B", Git adds that line to that file.

Having reverse-applied the commit to the current commit (through the work-tree and using the index that matches the current commit), Git copies the updated files into the index as if by git add, then makes a new commit, automatically supplying the commit log message. You can override some of these with various flags, and there are complications (see footnote 2) when the patch doesn't apply properly. But that's mostly it.
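Here is a sketch of a simple, unconflicted revert. The file names a.txt and b.txt and the commit messages are invented for the example; --no-edit accepts the automatically supplied log message.

```shell
# Sketch: git revert adds a new commit that backs out one earlier commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

echo "base" > a.txt
echo "base" > b.txt
git add a.txt b.txt
git commit -q -m "A"
echo "extra" >> a.txt
git commit -q -am "B: add a line to a.txt"; unwanted=$(git rev-parse HEAD)
echo "more" >> b.txt
git commit -q -am "C: add a line to b.txt"

git revert --no-edit "$unwanted"          # reverse-apply B on top of C
git log --oneline
```

The history grows to four commits: B is still there, and the new commit undoes only B's change while leaving C's change to b.txt alone.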


Footnote 2: This is actually too simple. Revert really invokes Git's three-way merge machinery (as does git cherry-pick). In simple, unconflicted cases, however, "apply a patch and commit" (cherry-pick) or "reverse-apply a patch and commit" (revert) suffice to describe the process.


Revert is a poor name for this process

Mercurial (which is otherwise a lot like Git, only slower and more user-friendly) calls this hg backout rather than hg revert, because it backs out the changes of a commit. The verb revert, often with the auxiliary word to as in revert to, means—at least to some people—to change the entire contents back. That is, instead of saying:

"commit a123456 changed one line of file README.txt and I want that one line changed back"

people sometimes mean:

"README.txt has been changed a lot since commit a123456, and I want the version that was in a123456 back, so that means I want _____"

and they fill in the blank with "to revert README.txt to a123456" and thus they reach for git revert.

That's not what git revert does. To do that, one needs to extract the file README.txt from commit a123456. Confusingly, the main Git command that does this is git checkout, using a different syntax from git checkout branch. (It should have been a separate command, and in Mercurial it is: it is hg revert!) If you want this in Git, you can write:

git checkout a123456 -- README.txt

which copies README.txt from commit a123456 into the index (as usual), then expands it into normal, not-Git-only, format into your work-tree as file README.txt.

Note that in all modern versions of Git, you can also use:

git show a123456:README.txt

which displays the contents of that file, as of that commit, on your screen, and generally works with redirection, so that you can save it to a file inside or outside of your work-tree:

git show a123456:README.txt > restored-readme

for instance. This does not affect the index.
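To make the difference between the two commands concrete, here is a sketch in a scratch repository. The variable old plays the role of a123456, and the file contents are placeholders.

```shell
# Sketch: git show only prints a file; git checkout <commit> -- <file> replaces it.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

echo "v1" > README.txt
git add README.txt
git commit -q -m "Old version"; old=$(git rev-parse HEAD)
echo "v2" > README.txt
git commit -q -am "New version"

git show "$old":README.txt > restored-readme   # index and README.txt untouched
before=$(cat README.txt)

git checkout "$old" -- README.txt              # index and work-tree now hold the old file
after=$(cat README.txt)
staged=$(git diff --cached --name-only)        # the old version is already staged
```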

torek

You can't revert if you haven't committed the changes. Instead, you might want to git stash some files, then git add the files you want, and then git commit the added files.

Then switch branches using git checkout mybranch and then use git stash pop to add back the stashed files.

EDIT WITH EXAMPLE

Let's say I'm on the branch master, and I modify file1 + file2 without committing. Then I switch to branch toto (git checkout -b toto), file1 + file2 changes will be visible in the branch toto BUT I want only file1 changes on this branch.

Well, I stash file2 with git stash push -- file2 (this needs Git 2.13 or later, and it 'resets' file2 to its committed state), then I git add file1, then git commit -m "yeahhhh".

After that, I go back to master branch, and git stash pop so I have my file2 modifications back.
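The whole sequence might look like the following sketch. It assumes Git 2.13 or later for git stash push with a path, and the file contents and identity are placeholders.

```shell
# Sketch of the stash workflow: commit file1 on toto, bring file2's edit back to master.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

echo "one" > file1
echo "two" > file2
git add file1 file2
git commit -q -m "Initial"

echo "file1 change" >> file1              # uncommitted edits made on master
echo "file2 change" >> file2

git checkout -q -b toto                   # both edits are still in the work-tree
git stash push -q -- file2                # set file2's edit aside (Git 2.13+)
git add file1
git commit -q -m "yeahhhh"                # toto now holds only the file1 change

git checkout -q master
git stash pop -q                          # file2's edit returns, still uncommitted
git status --short
```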

HRK44
  • But I think I can `revert` specific files in the branch and keep the changes in the files I actually want changes to. Would that not be the same? – TheStrangeQuark Aug 02 '18 at 14:59
  • ``git revert`` is a specific command to revert - sic - committed files. Since you have not committed any file (that's what you are saying in your post), this command won't be helpful in this case. – HRK44 Aug 02 '18 at 15:01
  • Maybe I didn't phrase it perfectly or am misunderstanding something in git. The latest committed changes to `master` are what I want to revert some of the files to – TheStrangeQuark Aug 02 '18 at 15:06
  • @greenthumbtack So are the files committed or nah? – HRK44 Aug 02 '18 at 15:09

Your misunderstanding is that you do not make changes in a "branch". You make changes to the current state of the files on your hard disk. Git does not relate these changes to a branch until you commit them.
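Applied to the situation in the question, one possible approach is to commit each file's edit on its own branch, which leaves master clean without reverting anything. This is a sketch under the assumption that file1 and file2 both have uncommitted edits on master; the contents and identity are placeholders.

```shell
# Sketch: give each uncommitted edit its own branch by committing selectively.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example User"       # placeholder identity for this demo
git config user.email "user@example.com"

echo "v1" > file1
echo "v1" > file2
git add file1 file2
git commit -q -m "Latest good version"

echo "edit" >> file1                      # uncommitted edits made on master
echo "edit" >> file2

git checkout -q -b file1_update           # the edits travel with the work-tree
git add file1
git commit -q -m "Update file1"           # only file1's edit is committed here

git checkout -q -b file2_update master    # file2's edit is still uncommitted
git add file2
git commit -q -m "Update file2"           # only file2's edit is committed here

git checkout -q master                    # nothing left to revert
git status --short
```

Because each edit now lives in a commit on exactly one branch, switching branches shows a different version of each file.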

Timothy Truckle
  • What is the purpose of a branch then? I thought I could essentially have different versions of the same file in a repo by switching between branches. That's probably a bigger question than just for this comment and question though – TheStrangeQuark Aug 02 '18 at 15:33
  • @greenthumbtack For this to be true, you have to ``git commit`` the file in the branch, then when you switch branch, you will have different commits that change the file in different ways. – HRK44 Aug 02 '18 at 16:14