If git only records a "snapshot" of your files, then how does it undo a change?

Question

I am a newbie to version control, and very new to git. I know that version control systems such as svn store the changes made, while git keeps a record of "snapshots" (commits). So how is it possible to undo a change in git? What is git actually doing?

I also found this: "Are Git's pack files deltas rather than snapshots?" which seems to state git does store deltas.

What do you mean by "undo a change?" Do you mean undo a change to a file in your working copy which hasn't yet been committed? Do you mean move back to a different point in history? Something else? Your question needs to be made more clear. While useful on their own, the two answers you've been given so far don't really answer your question, which may be because it's poorly defined. — Gary Fixler, Dec 04 '13 at 03:53
Start here, see if this stuff makes any sense, before proceeding: http://eagain.net/articles/git-for-computer-scientists/ — torek, Dec 04 '13 at 03:54
Git can create a diff by comparing two snapshots. It can also apply a diff to a snapshot (commit). — Prihex, Sep 27 '21 at 11:25
@NickVolynkin "If a plane is heavier than air, how does it fly?" That's a perfectly valid question, too. — Sz., Jan 03 '22 at 18:35

score 1 · Answer 1 · answered Jun 17 '15 at 19:46

It is indeed true that a git commit reflects a snapshot of your root directory. When you say "undo a change," I am assuming you mean through git revert [commit]. When you run that, git looks at [commit]'s parent (which is also a snapshot) and compares it to [commit]'s snapshot, and creates a diff on the fly. It then takes that diff and applies its "opposite" to HEAD. Interestingly enough, if a commit has more than one parent (i.e. it is a merge commit), you need to specify which parent to use to create the diff with the -m option.
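To make the "diff on the fly" idea concrete, here is a minimal Python sketch. It is only a model, not git's implementation: snapshots are plain dicts of path to content, the diff is per file rather than per line, and the names `diff` and `revert` are made up for illustration. Real git does a line-level three-way merge and can report conflicts inside a file.

```python
def diff(old, new):
    """Compare two snapshots (dicts of path -> content) and list what changed."""
    changes = {}
    for path in old.keys() | new.keys():
        before, after = old.get(path), new.get(path)
        if before != after:
            changes[path] = (before, after)  # None means "file absent"
    return changes


def revert(head, parent, commit):
    """Build a new snapshot that undoes `commit` on top of `head`."""
    result = dict(head)
    for path, (before, after) in diff(parent, commit).items():
        # If HEAD no longer has the content the commit produced,
        # this simplified model just gives up (git would try to merge).
        if head.get(path) != after:
            raise ValueError(f"conflict in {path}")
        if before is None:
            result.pop(path, None)  # the commit added the file: remove it
        else:
            result[path] = before   # the commit changed or deleted it: restore it
    return result


parent = {"a.txt": "one\n"}
commit = {"a.txt": "two\n", "b.txt": "new\n"}
head = {"a.txt": "two\n", "b.txt": "new\n", "c.txt": "later\n"}

reverted = revert(head, parent, commit)
print(reverted)
```

Note that nothing in `parent`, `commit`, or `head` is modified. The revert is a new snapshot, which is why git revert adds a new commit instead of rewriting history.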

As far as pack files go, yes, they use deltas, but that is a storage implementation detail. The deltas are between individual stored objects (such as two versions of a file's contents), not between commits. In fact, git often stores the latest version of a file in its entirety and stores previous versions as deltas against it. But for all intents and purposes, a git commit is a snapshot.
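As a rough illustration of that storage detail, the sketch below rebuilds an older version of a file from the latest version plus a delta. It assumes a simplified delta made of "copy a range from the base" and "insert these literal bytes" instructions. Git's real pack format encodes the same two kinds of instruction in a compact binary form, and the function name `apply_delta` is hypothetical.

```python
def apply_delta(base, ops):
    """Rebuild one version of a file from a base version plus instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":      # ("copy", offset, length): reuse bytes of base
            _, offset, length = op
            out += base[offset:offset + length]
        else:                    # ("insert", data): literal bytes not in base
            out += op[1]
    return bytes(out)


# The latest version is stored in full.
latest = b"line 1\nline 2 changed\nline 3\n"

# The older version is stored only as instructions against `latest`.
older_delta = [
    ("copy", 0, 7),              # "line 1\n"
    ("insert", b"line 2\n"),
    ("copy", 22, 7),             # "line 3\n"
]

older = apply_delta(latest, older_delta)
print(older.decode())
```

Whichever way the bytes are packed, what comes back out is always the complete file, so the commit still behaves as a snapshot.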

David Deutsch

" But for all intents and purposes, a git commit is a snapshot" - the same can easily be said for SVN. I'm really not sure where this idea that "SVN stores changes, git stores snapshots" comes from. – Ben Jun 17 '15 at 19:53
Because SVN actually stores changes; if you only have a single commit from SVN without its history, you cannot reproduce the snapshot. With Git, if you have a single commit, you have the snapshot. The fact that you have to specify a parent when you revert a commit shows that the diffs need to be created on the fly. The pack file business is at a **way** lower level than the level of a commit. You can easily have a git repo with no pack files, in fact. This distinction can make a big difference when trying to figure out why a file's "commit history" looks weird. – David Deutsch Jun 17 '15 at 19:58
How exactly would you get a "single commit from SVN without its history"? Are you talking about a single transaction before it becomes a revision on the server? If you have an SVN repository, won't you always have the history, and be able to reproduce any file from any revision in the history? – Ben Jun 17 '15 at 20:05
SVN was specifically designed so you could think of it as a series of sequential snapshots of a directory tree: http://svnbook.red-bean.com/en/1.7/svn.basic.in-action.html#svn.basic.in-action.revs . So unless I'm completely misunderstanding what is meant by "stores the changes made" vs "stores snapshots" then SVN and pretty much any other atomic-commit system also "stores snapshots". Even if you're just talking technical implementation details, SVN stores snapshots (at least partially): http://stackoverflow.com/a/2332860/1390430 – Ben Jun 17 '15 at 20:13
Where it really makes a difference is in the commit history of a file. In SVN, a file is "part of a commit" iff it is part of the diff. In Git, when there is more than one parent, there is no single diff, so it has to be decided which diff to look at. Depending on the choice the tool makes, you can end up with a history where the last commit does not match the working directory, or a history where commits you did make are not in the history at all. Interesting that SVN sometimes stores snapshots. With Git the implementation is less of an "internal detail" than with SVN, I suspect. – David Deutsch Jun 17 '15 at 20:35
Well either way I guess you answered the OP's actual question. :-) – Ben Jun 17 '15 at 20:41

score 0 · Answer 2 · answered Dec 04 '13 at 02:16

Git keeps a directory called .git/objects containing many files. The contents of each file in your project are hashed, compressed, and stored under this directory in a file named after the hash. This is how git detects identical files and saves them only once. When you commit, git also records each file's original name alongside its hash (in tree objects), so the whole snapshot can be rebuilt later. Separately, git keeps a reflog that records where your branches and HEAD have pointed, which helps you find your way back to earlier commits. You can read about it here:

http://gitready.com/intermediate/2009/02/09/reflog-your-safety-net.html
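Here is a small Python sketch of the content-addressed storage idea. It assumes the default SHA-1 object format, where a blob's ID is the hash of a short header plus the file contents. The `ObjectStore` class is hypothetical, and it compresses only the contents, which is a simplification of what git writes to disk.

```python
import hashlib
import zlib


def blob_id(content: bytes) -> str:
    """Hash the contents the way git names a blob: 'blob <size>\\0' + data."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Toy stand-in for .git/objects: a dict keyed by content hash."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        oid = blob_id(content)
        # Identical contents hash to the same ID, so they are stored once.
        self.objects.setdefault(oid, zlib.compress(content))
        return oid

    def get(self, oid: str) -> bytes:
        return zlib.decompress(self.objects[oid])


store = ObjectStore()

# Each snapshot maps file names to blob IDs (roughly what a tree object does).
snapshot_1 = {"README": store.put(b"hello\n"), "copy.txt": store.put(b"hello\n")}
snapshot_2 = {"README": store.put(b"hello\n"), "copy.txt": store.put(b"changed\n")}

print(len(store.objects))  # number of distinct contents actually stored
```

Two snapshots with four file entries end up needing only two stored objects, because unchanged or duplicated contents are shared.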

score 0 · Answer 3 · answered Dec 04 '13 at 02:17

Distributed version control systems such as git store your entire history in each local copy (every version of every file is in there), whereas centralized ones such as SVN require a connection to a central server to reach older revisions. While this means your initial "clone" of a git repository may take longer than, say, checking out an SVN repo, you have the advantage of being able to access the entire history locally with git, without connecting to a central server.

score 0 · Answer 4 · answered Dec 04 '13 at 05:47

Git saves your work in "snapshots" of the state of your project at a given point in time. These saves are called "commits".

To see all the commits you have made you can open git shell in a repository and write:

$ git log

What you'll see is the unique SHA-1 hash of every commit and some information about it (who made it, time, title, message, etc.). These commits are specific to the branch you currently have checked out (unless you have merged this branch with another one).

To answer your question about how to "undo a change" (or more correctly: how to return your workspace to the state of a previous commit), you first have to find the SHA-1 hash of the commit you want to return to. You don't need to type the whole hash; the first 6 or so characters are usually enough, as long as they are unique in the repository.

So, let's say I want to return to a commit with hash 49c005. What I have to do is write this command in the git shell:

$ git reset --hard 49c005

There are also other ways to use the "reset" command in git, but this is the one I have found easiest. Be aware that --hard also throws away any uncommitted changes in your working directory.

If you need further reference you can always check out the git reset documentation.
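Since every commit is a full snapshot, a hard reset does not need to "replay" anything backwards. A minimal Python model of what the command above does, assuming a toy repository held in two dicts (the hash 7d1e2f and the function `reset_hard` are made up for illustration):

```python
# Toy repository: each commit holds a complete snapshot of the files.
commits = {
    "49c005": {"parent": None, "snapshot": {"a.txt": "v1\n"}},
    "7d1e2f": {"parent": "49c005",
               "snapshot": {"a.txt": "v2\n", "b.txt": "extra\n"}},
}
branches = {"master": "7d1e2f"}


def reset_hard(branch, target):
    """Point the branch at `target` and return that commit's files."""
    branches[branch] = target                  # move the branch pointer
    return dict(commits[target]["snapshot"])   # working files now match it


working_tree = reset_hard("master", "49c005")
print(working_tree)
```

The newer commit is not deleted by this; the branch simply no longer points at it, which is why it can often still be found through the reflog.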
