What is the difference between squashing and deleting a commit in git?

Question

Suppose I do

```
git rebase -i HEAD~3
```

and the following opens up in a text editor:

```
pick ae27841 Commit 1
pick fd8a71e Commit 2
pick badd490 Commit 3
```

I want to convert these 3 commits into 1 commit so I can push that commit to my repository and then open a pull request. I understand that there are 2 ways to go about this:

I can leave one commit as pick and squash the other two, i.e.

```
pick ae27841 Commit 1
s fd8a71e Commit 2
s badd490 Commit 3
```

I can delete 2 of those 3 commits, i.e.
```
pick ae27841 Commit 1 
```

What is the difference between these 2 methods? As I understand it, each commit is a different version of the project. Hence my latest commit will be my latest version, right? So my latest commit is all that I need to keep. Since the other 2 commits are 'older' versions of the project, I have no need for them and so I can delete them. So is method 2 the correct way to go about converting my 3 commits into one? If so, in what kind of case would I need to squash my commits instead?

What is the correct way here? Squashing or deleting commits?

Perhaps not what you want to see, but if I just wanted to "squash" the last three commits of my project, I'd just do this: `git reset --soft HEAD~3; git commit -m "squashed commit"` – eftshift0, May 31 '17 at 15:58

Mark Adelsberger · Answer 1 · 2017-05-31T16:18:25.207

3

UPDATE - From comments it seems the misunderstanding is a bit different from the question, so I've added some notes at the bottom

Original Answer

To squash a commit is to add its changes to the commit that came before it. To delete a commit is to not perform its changes.

So if you have

A --- B --- C <--(master)

where A creates A.txt, B creates B.txt, and C creates C.txt, if you squash the commits you get

ABC <--(master)

where ABC is a single commit that creates A.txt, B.txt, and C.txt whereas if you delete B and C you get

A <--(master)

and only A.txt gets created.
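
The difference can be reproduced in a throwaway repository. The following is a minimal sketch, not part of the original answer. It assumes a POSIX shell with git installed and a `sed` that accepts `-i`, it adds an extra empty "base" commit so that `HEAD~3` exists, and it uses the `GIT_SEQUENCE_EDITOR` variable to edit the rebase todo list without opening an editor. The `make_repo` helper and the file contents are made up for illustration.

```shell
#!/bin/sh
set -e

# Build a throwaway repo: an empty base commit, then A, B, C (each adds one file).
make_repo() {
    dir=$(mktemp -d)
    cd "$dir"
    git init -q .
    git config user.name "Example"
    git config user.email "example@example.com"
    git commit -q --allow-empty -m "base"
    for name in A B C; do
        echo "$name" > "$name.txt"
        git add "$name.txt"
        git commit -q -m "Commit $name"
    done
}

# Squash: keep line 1 of the todo list as "pick", turn lines 2 and 3 into "squash".
make_repo
GIT_SEQUENCE_EDITOR="sed -i -e '2,3s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~3
squashed_files=$(git ls-tree --name-only HEAD | xargs)
echo "after squash: $squashed_files"

# Delete: remove lines 2 and 3 from the todo list instead.
make_repo
GIT_SEQUENCE_EDITOR="sed -i -e '2,3d'" GIT_EDITOR=true \
    git rebase -q -i HEAD~3
dropped_files=$(git ls-tree --name-only HEAD | xargs)
echo "after delete: $dropped_files"
```

In the squash case the resulting commit should list `A.txt B.txt C.txt`. In the delete case it should list only `A.txt`, because the changes from B and C are never replayed.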

Added Notes

So a git COMMIT object has, among other things, a reference to a TREE object representing your project content at that moment. A TREE is roughly a directory listing, containing a list of names for other TREE objects (subdirectories) and BLOB objects (files).

Internally (if objects are packed) a BLOB might be represented as a delta from another BLOB - but the current revision generally is the complete object, with the delta used to construct the older version of the file.

In either case, in any valid repo you can reconstruct the complete state of the project as it was committed from the TREE reference on the COMMIT; the PARENT is used for history tracking, but is not needed for construction of the project state.
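
To look at these objects directly, `git cat-file` can print them. This is a small sketch under assumed names (the files `top.txt` and `sub/inner.txt` are hypothetical), and it needs a POSIX shell with git installed:

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

mkdir sub
echo "hello" > top.txt
echo "nested" > sub/inner.txt
git add .
git commit -q -m "first"

# The COMMIT object: a "tree <hash>" line, any "parent" lines, author, committer, message.
git cat-file -p HEAD

# The TREE it references: one entry per file (blob) or subdirectory (tree).
git cat-file -p 'HEAD^{tree}'

# Object types of the root tree, the subdirectory, and the file.
tree_type=$(git cat-file -t 'HEAD^{tree}')
sub_type=$(git cat-file -t HEAD:sub)
file_type=$(git cat-file -t HEAD:top.txt)
echo "$tree_type $sub_type $file_type"
```

Because this first commit has no parent, its printed form has no `parent` line, yet the whole project state is still reachable through its `tree` line.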

However, when you pick a commit during a rebase, that doesn't mean that you're replicating that commit's TREE; rather it means that git will figure out the diff between that commit's TREE and that commit's PARENT's TREE, and apply that set of changes.

This idea of processing a commit in terms of its difference from its parent is important during both REBASE and MERGE operations. In a way, even though the commit is structured so it can reproduce a snapshot of the project, it's often useful to think of it as just representing that set of changes.
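
The two views of a commit can be compared side by side. The sketch below is an illustration only, assuming a POSIX shell with git installed. It uses `git cherry-pick` as a stand-in for a rebase `pick`, since both apply a commit as a diff against its parent, and it reuses the A/B/C file names from the example above.

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
for name in A B C; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Commit $name"
done

# Snapshot view: the tree of C lists every file in the project.
snapshot=$(git ls-tree --name-only HEAD | xargs)

# Patch view: the diff between C and its parent B touches only C.txt.
patch=$(git diff --name-only HEAD~1 HEAD | xargs)

# Replay C directly on top of A: only C's diff is applied, so B.txt is absent.
c_commit=$(git rev-parse HEAD)
git checkout -q --detach HEAD~2
git cherry-pick "$c_commit" >/dev/null
replayed=$(git ls-tree --name-only HEAD | xargs)

echo "snapshot of C:   $snapshot"
echo "diff B..C:       $patch"
echo "C replayed on A: $replayed"
```

If picking copied C's snapshot, the last line would include `B.txt`. It does not, which is why deleting a line from the todo list drops that commit's changes.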

edited May 31 '17 at 16:18

answered May 31 '17 at 16:05

Mark Adelsberger


Okay so squash it is. Also, how do you decide which commits to squash and which to leave as `pick`? – mjsxbo May 31 '17 at 16:12
Well, squash combines a commit with the commit before. So if you have a list of commits to convert into one, you `pick` the first and `squash` the rest – Mark Adelsberger May 31 '17 at 16:19
I can also manually change the order of the commits in the text editor. Supposing I place the second or the third commit at the top and 'pick' it, will it make any difference? Does it matter which commit is taken as pick? – mjsxbo May 31 '17 at 16:22
It might make a difference if the commits affect the same lines of the same files. If you want to reproduce the effects of the existing commits, don't reorder them. If you find that reordering and squashing is somehow useful, then you're probably making things too complicated. – Mark Adelsberger May 31 '17 at 19:27
What is the difference between `ABC` and `C`? Both have `A.txt`, `B.txt`, and `C.txt`. Can you make an example where `ABC` and `C` would be different? – actual_panda Jul 06 '20 at 00:25
@actual_panda The difference between `ABC` and `C` in this example is that `ABC` has no parent - it is the root of its tree - whereas `C`'s parent is `B`. (The parent is an integral part of the commit.) That is the whole meaning of "squashing a commit" - you're eliminating ancestors from its history. – Mark Adelsberger Jul 06 '20 at 22:08
@MarkAdelsberger alright, but the content of the working directory for checking out `C` and `ABC` is the same, if I understood correctly. After squashing it's like A and B have never existed. Correct? – actual_panda Jul 08 '20 at 05:33
@actual_panda Again, that is what "squashing a commit" means. If `C` and `ABC` don't have exactly the same `TREE`, then what happened would not be a squash. – Mark Adelsberger Jul 08 '20 at 19:17
Yeah, I got tripped up by the "incorporate a commit into the previous commit" wording. In my head it was something like a hybrid of the commits. Thanks for helping out, I understand it now. What added to the confusion is that during a rebase, commits are "treated" as incremental changes, even though commits themselves are not deltas but represent the whole working tree. – actual_panda Jul 09 '20 at 08:22
Right; it would be nice if git always stuck to one set of concepts or the other. Generally a commit "is" a snapshot (though the physical storage may use deltas to reduce total size) but as you note, a few commands treat them like patches. Or maybe it would be better to say, a few commands *operate on patches* rather than commits, but the docs don't make that clear. – Mark Adelsberger Jul 09 '20 at 18:19

score 0 · Answer 2 · answered May 31 '17 at 16:03

0

When deleting a commit, you remove all the changes it introduced from the history, i.e. from all the commits that follow it. Squashing removes the commit from the history, too, but its changes are incorporated into the resulting commit.

answered May 31 '17 at 16:03

choroba


1

So the current (final) version of my project is not determined only by my latest commit, but all the commits I took to get there? Is that what you're saying? – mjsxbo May 31 '17 at 16:05
1

@NirvanAnjirbag - No, that's not exactly right. The latest commit *does* contain the `TREE` object which - along with its dependent objects - fully defines your current state. But when you rebase, you don't just move commits around; you create new commits by "replaying" the diffs from a series of old commits. – Mark Adelsberger May 31 '17 at 16:08

jack guan · Answer 3 · 2017-06-02T03:40:18.827

-1

First, this concept about git is wrong here:

As I understand it, each commit is a different version of the project. Hence my latest commit will be my latest version right?

When you delete Commit 2 and Commit 3, Commit 1 will change. Let me call the result (Commit 1)'. Even though (Commit 1)' keeps the same commit message as Commit 1 and the same modifications relative to the previous version, they are NOT the same version: (Commit 1)' and Commit 1 have different hashes.

method1: this is what you want. It keeps the latest version of the code and the commit message. But you should pay attention to the fact that the hash has changed.

method2: this is not what you want. The code committed in Commit 2 and Commit 3 will be lost.

edited Jun 02 '17 at 03:40

answered Jun 01 '17 at 03:41

jack guan


*"this concept about git is wrong here: ..."* Actually, no. Based on its internal structure, a commit *is* a version of the project, not a modification. Many git commands (like rebase) work by creating a diff between two versions (the one represented by the commit, and the one represented by its `PARENT` -- or an empty tree if there is no parent). You can see the true nature of a commit when you use shallow repos (without the parent, you can see the *version* but **not** the *modification*), or re-parent using `filter-branch` (you replicate the tree, not the changes) – Mark Adelsberger Jun 01 '17 at 13:55

ElpieKay · Answer 4 · 2017-05-31T23:46:58.810

-2

Squashing and deleting are technically the same in Git. They create new commits, taking some existing commits as parents, and no commits are deleted.

Suppose the commit history is ABCD. When squashing BCD, a new commit E is created as a new child of A. The new commit includes the changes of BCD. BCD are still there. The current ref (branch or HEAD for example) then moves from A to E (squash merge) or from D to E (interactive rebase). If the ref was at D and now is at E, it seems BCD are lost but they are still there in fact.

As for "deleting" by cherry-picking C onto A, a new commit C' is created as a new child of A. C' has changes equivalent to those of C (not always exactly the same as C). BCD are still there. The current ref moves from A to C'. When you run git reset, the current ref moves from one commit to another.

Another example is git commit --amend. When you run it for ABCD, a new commit D' is created as C's new child. The ref moves from D to D'. Now D seems deleted but it's still there. Amend is said to modify and update the last commit but it in fact creates a new sibling commit instead.
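
The amend case can be checked with a few commands. This is a rough sketch, assuming a POSIX shell with git installed. The two-commit history (C, D) and the commit messages are made up for illustration.

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
for name in C D; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Commit $name"
done

old=$(git rev-parse HEAD)                      # D
git commit -q --amend -m "Commit D, amended"
new=$(git rev-parse HEAD)                      # D'

# D and D' are different commits that share the same parent (C).
old_parent=$(git rev-parse "$old^")
new_parent=$(git rev-parse "$new^")

# D is no longer on the branch, but the object can still be looked up by hash.
old_type=$(git cat-file -t "$old")

echo "D:  $old (parent $old_parent)"
echo "D': $new (parent $new_parent)"
echo "type of old object: $old_type"
```

The old commit stays in the object database only until git's garbage collection prunes unreachable objects, so "still there" should not be read as permanent.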

edited May 31 '17 at 23:46

answered May 31 '17 at 23:40

ElpieKay


*"Squashing and deleting are technically the same in Git"* Not at all true; they do entirely different things. They are both history rewrites, and as such have some similar side-effects, but having those similarities does not make them "technically the same". The most important thing about them - what effect they have - is totally different. – Mark Adelsberger Jun 01 '17 at 14:01
@MarkAdelsberger they do the same thing, creating new commits and rewriting the history. That's what I mean by technically the same. – ElpieKay Jun 01 '17 at 14:49
I'm aware of what you meant. And yet the words you used are incorrect and misleading. They use similar internal mechanisms (in that they are both operations of rebase), but they *are not* by any stretch of imagination "technically the same". In fact given the context that they are "operations of rebase", all that would be left to say is that they're completely different. – Mark Adelsberger Jun 01 '17 at 14:57
