
I think I understand git pull, and this is how I explain it in what I call "simple terms":

  1. Generally speaking, git pull is about merging a "remote" branch into a "local" branch.
  2. In more detail, git uses the content of the "remote" branch to "update" / "modify" content of the "local" branch.
  3. In even more detail, if a file has been modified in the "local" branch but not in the "remote" branch, then after the merge, the content of the file will be the same as the content in the "local" branch. The opposite is also true. If a file was modified on the "remote" branch but not in the "local" branch, the content will be taken from the "remote" branch.
  4. If a file was modified in both branches ("local" and "remote"), then git will try to take modifications from both branches. If the changes happen in different places in the file, both changes will be applied and will be present in the content of the file after the merge.
  5. If the changes happen in the same place, we have what is known as a "merge conflict", and I am not going to touch this case for simplicity.
  6. As a result of the merge, we modify the "local" repository and therefore we need to "commit".
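Points 3 to 5 can be tried out with a small throwaway experiment. The following is only a sketch under assumptions: the directory names `remote` and `local`, the file `notes.txt`, and the `demo` identity are made up for illustration, and `--no-rebase --no-edit` is passed so that the pull merges without opening an editor. The remote side edits the first line of the file, the local side edits the last line, and the pull combines both.

```shell
set -e
# Throwaway identity so the commits work on any machine (illustrative values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# The "remote" repository starts with a five-line file.
git init -q remote
cd remote
printf 'one\ntwo\nthree\nfour\nfive\n' > notes.txt
git add notes.txt
git commit -q -m "base"
cd ..

# The "local" repository is a copy of it.
git clone -q remote local

# The remote side changes the first line ...
cd remote
printf 'ONE\ntwo\nthree\nfour\nfive\n' > notes.txt
git commit -q -am "remote: edit first line"

# ... and the local side changes the last line.
cd ../local
printf 'one\ntwo\nthree\nfour\nFIVE\n' > notes.txt
git commit -q -am "local: edit last line"

# Merge the remote branch into the local one.
git pull -q --no-rebase --no-edit
cat notes.txt   # contains both ONE and FIVE
```

In this run git records the merge commit by itself; a manual commit is only needed when the merge stops on a conflict.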

Now I want to get the same kind of explanation for git pull --rebase. I do not want to use such terms as "head", "index", "fetch", "upstream" because these terms / concepts only confuse beginners like me. I know that I need to learn these "advanced" concepts, and I am doing so by reading tutorials, but for now, as a part of my learning process, I want to understand git pull --rebase.

ADDED

I think at some point I heard the following explanation of git pull --rebase. When we merge, we do not do it in a "symmetric" way, as described above. Instead, we first "forget" the changes in the "local" repository and apply only the changes from the "remote" repository. By doing that we basically "copy" the remote repository as it is. After that we apply the changes from the "local" repository on top. However, it is still not clear to me what exactly this means, in particular what "on top" means.

Roman
  • Note that git pull is a git fetch followed by a git merge. It might actually help to understand what HEAD is, along with the index. If you don't, you'll hit a sandbank if something doesn't go as planned quite quickly. – rubenvb Dec 02 '17 at 10:10
  • As I wrote in the question, I do not know what "fetch" means. – Roman Dec 02 '17 at 10:13
  • @Roman, you do now... – alexis Dec 02 '17 at 10:15
  • Unfortunately, as alexis said in his answer, "fetch" *isn't* an advanced concept. Neither is Git's index. Some of the tricks you can do *with* fetch and *with* the index are, but these two are basic concepts that you *must* understand. As @rubenvb noted, the very existence of the index will cause problems if you're not aware of it. It's kind of unfortunate that way, and it wasn't explained well to me when I started with Git either. – torek Dec 02 '17 at 19:19

2 Answers


I see two things that could be clarified: You are focusing on the state of a file in the two branches, but a better way to consider what is going on is in terms of the changesets that have occurred. The second issue is that git pull is shorthand for two operations: git fetch, and git merge. Yes, you write that you "don't want to use words like fetch", but that's not an "advanced concept". If you want to understand what's going on, you need to start there.

  • git fetch essentially informs the local repo of changes that it did not know about.

  • git merge unifies the newly arrived changes with your local changes.
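The two steps can be run separately to see what each one does. This is a minimal sketch, not a recipe: the directories `remote` and `local`, the file names, and the `demo` identity are assumptions made up for the example, and `@{upstream}` is git's shorthand for the remote branch that the current branch follows.

```shell
set -e
# Throwaway identity so the commits work on any machine (illustrative values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q remote
(cd remote && echo base > base.txt && git add base.txt && git commit -q -m "base")
git clone -q remote local

# Work happens on both sides without synchronization.
(cd remote && echo o > remote.txt && git add remote.txt && git commit -q -m "remote work")
cd local
echo x > local.txt
git add local.txt
git commit -q -m "local work"

# Step 1: fetch. The local repo learns about the remote commit,
# but the files in the working directory are not touched.
git fetch -q
behind=$(git rev-list --count "HEAD..@{upstream}")
echo "remote commits not yet merged: $behind"
ls   # remote.txt is not here yet

# Step 2: merge. The newly arrived changes are unified with the local ones.
git merge -q --no-edit "@{upstream}"
ls   # now base.txt, local.txt and remote.txt are all present
```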

The catch is that if things have been happening on both repos without synchronization, they may have diverged:

... b--o--o--o--o  (remote)
     \
      x--x--x      (local)

The above shows time left to right; the rightmost point is the most recent. So the newly arrived changes are modifications to an older state of the files, the one marked "b".

  • git pull, i.e. git fetch followed by a plain git merge, will merge the most recent state of the two branches as best as it can.

  • git pull --rebase will pretend that your changes were made not to the state marked "b", but to the most current remote state. In other words it will try to rewrite history so that it looks like this:

    ... b--o--o--o--o              (remote)
                     \
                      x--x--x      (local)
    

That's the difference. One consequence is that if you don't rebase, the history of your repo contains some states (which you can rewind to in the future, if you want) where the "x" changes were applied but the "o" changes are absent. After rebasing, there is no such place in the repository.
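The two outcomes can be compared side by side on a toy setup that mirrors the diagrams: one base commit "b", four remote commits "o" and three local commits "x". Everything named here (`remote`, `merged`, `rebased`, the file names, the `demo` identity) is an assumption for illustration only. In this particular example the final file content comes out the same either way; only the recorded history differs.

```shell
set -e
# Throwaway identity so the commits work on any machine (illustrative values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Commit "b" on the remote side, then a local copy with three "x" commits.
git init -q remote
(cd remote && echo b > b.txt && git add b.txt && git commit -q -m "b")
git clone -q remote merged
(
  cd merged
  for i in 1 2 3; do
    echo "$i" > "x$i.txt"; git add "x$i.txt"; git commit -q -m "x$i"
  done
)
cp -R merged rebased   # second, identical local copy for the comparison

# Four "o" commits arrive on the remote side in the meantime.
(
  cd remote
  for i in 1 2 3 4; do
    echo "$i" > "o$i.txt"; git add "o$i.txt"; git commit -q -m "o$i"
  done
)

# Plain pull: the history keeps the fork and gains a merge commit.
(cd merged && git pull -q --no-rebase --no-edit && git log --oneline --graph)

# Pull with rebase: the history is one straight line, x commits after o commits.
(cd rebased && git pull -q --rebase && git log --oneline --graph)

# In the merged history there is still a state with the x files and no o files:
git -C merged ls-tree --name-only 'HEAD^1'
```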

alexis
  • Thank you for the very clear answer. I have understood everything you wrote and I have learned what `fetch` and `rebase` mean. The only thing I am missing is whether the content of my local branch will differ depending on whether I do `git pull` or `git pull --rebase`. In other words, could doing `o-o-o` and `x-x-x` in parallel and then merging give an outcome (content) that is different from the content we get when we do `o-o-o` first and then `x-x-x`? In yet other words, are `o-o-o` and `x-x-x` commutative operations? – Roman Dec 02 '17 at 11:32
  • @Roman: indeed, there are cases where these *aren't* well behaved, and you do get different results. They are not super common, but they do exist. I advise newcomers to Git to avoid `git pull` entirely: run `git fetch` first, then run either `git merge` if you want a merge, or `git rebase` if you want a rebase. Splitting the two gives you a chance to run `git log` in between, in order to make the decision, as well as giving you a much clearer mental picture of what's going on. – torek Dec 02 '17 at 19:21

Simple: as long as your work is local (meaning it has not been pushed), a git pull --rebase will serve to replay your local work on top of an updated history.

git fetch will update said history with the latest commits of the remote repo (origin/master for instance).
Then your work (your local commits of your master branch) will be replayed one by one (which is what a rebase does) on top of that updated history.

The idea is that, when you want to push, said push will be very simple, and will need no merge, since your commits are simply new commits done on top of origin/master.
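Here is a rough sketch of that workflow with a shared repository and two clones. The names `remote.git`, `alice` and `bob`, the file names and the `demo` identity are all hypothetical and only serve the example. After the rebase, the local commit "x" sits directly on top of the fetched commit "o", so the final push goes through without any merge.

```shell
set -e
# Throwaway identity so the commits work on any machine (illustrative values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# A shared repository and a first clone that publishes the base commit.
git init -q --bare remote.git
git clone -q remote.git alice 2>/dev/null   # silences the "empty repository" warning
(cd alice && echo b > b.txt && git add b.txt && git commit -q -m "b" \
  && git push -q -u origin HEAD)

# A second clone; then the first clone publishes more work.
git clone -q remote.git bob
(cd alice && echo o > o.txt && git add o.txt && git commit -q -m "o" && git push -q)

# Local, unpushed work in the second clone.
cd bob
echo x > x.txt
git add x.txt
git commit -q -m "x"

# Fetch the new history and replay the local commit on top of it.
git pull -q --rebase
git log --oneline   # x, then o, then b

# The push only adds new commits on top of what the remote already has.
git push -q
```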

Note that you can hide that rebase part entirely since Git 2.6:

git config pull.rebase true
git config rebase.autoStash true

And even if the rebase does not go well and you have to abort, Git will restore the stashed current work for you since Git 2.10.
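As an illustration of those two settings, here is a small sketch with made-up names (`remote`, `local`, the file names, the `demo` identity). The settings are applied to one repository only, and the pull is run while a tracked file still has an uncommitted edit. With both settings on, a plain `git pull` should rebase and then bring the uncommitted edit back.

```shell
set -e
# Throwaway identity so the commits work on any machine (illustrative values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q remote
(cd remote && echo b > b.txt && git add b.txt && git commit -q -m "b")
git clone -q remote local
(cd remote && echo o > o.txt && git add o.txt && git commit -q -m "o")

cd local
# Per-repository settings: pull rebases, and rebase stashes unfinished work.
git config pull.rebase true
git config rebase.autoStash true

echo x > x.txt
git add x.txt
git commit -q -m "x"
echo "work in progress" >> b.txt   # an edit that is NOT committed

# No --rebase flag needed: the config makes this a rebase with auto-stash.
git pull -q
git status --short   # b.txt is still listed as modified
```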

VonC