What is the most efficient way to merge upstream changes into working copy with Git?

Question

A very common scenario when people code together is having to bring yourself (say, the feature branch you're working on) up to date. This can give rise to a "conflict": the same file has been changed both upstream and in your working tree. My question is: what's the most direct way to resolve this type of conflict with Git?

Specifically, I'm looking for something concise that gives good visibility (i.e. knowing what's happening, because sometimes the conflicts are beyond automatic resolution and one would like to know that before the merge). Something better than this:

# I'm on a branch, have changes in working tree,
# have overlapping change in remote; for simplicity,
# assume no local (unpushed) commits have been made (clean HEAD)

git fetch
git difftool origin/<branchName> HEAD   # visually examine incoming changes, to understand them; conclude that the changes are automatically merge-able

#git merge              # fails: "Your local changes to the following files would be overwritten by merge:"
#git merge -s resolve   # fails: same
#git merge -s recursive # fails: same
#git mergetool          # no-op: "No files need merging" ?!

git stash
git pull
git stash pop

git difftool HEAD    # visually examine outcome - it worked, but does stashing really need to be involved?

Other answers involved committing after the fetch, which is a no for me: I have nothing to commit at this stage. What I want is to bring myself (i.e. my working tree) up to date with upstream.

score 1 · Answer 1 · answered Sep 29 '18 at 17:36

I think refusing to merge is a bad idea:

First, committing your work lets you restore it easily, which is close to impossible if you never even ran git add on it at some stage. If you need to rework your patches later on, you can amend them; that's a standard way of working.

If I understand you correctly, your work is ongoing, so you expect your patch to be reworked after the merge. I would therefore suggest rebasing your patches if they are still fairly small: the rebase will point out any conflicts, which should be easy to fix when the patches are small.

If you have a huge branch that is likely to produce a lot of conflicts, I would suggest finishing the patches you are working on before merging, so that you won't have "draft" patches in your history when you need to push your work to the shared Git repository.

I personally use the first option (rebasing), even when I have a lot of patches pending. This lets me revalidate each patch while performing the rebase and also avoids the 'evil merges' that I dislike (they make git blame painful, IMHO).

In conclusion, I think you should reconsider your reluctance to commit temporary work; it gives you far more safety than stashes, and allows you to use merge the way you already seem to know.
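
A minimal sketch of that commit-then-rebase flow, assuming a throwaway "upstream" repository and a clone of it created in a temporary directory. All paths, file names, and commit messages are made up for illustration, and the rebase is clean only because the two sides touch different files:

```shell
#!/bin/sh
# Sketch: commit work in progress, rebase it onto upstream, amend it later.
# Repository paths, file names and commit messages are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)

# A throwaway "upstream" repository with one commit.
git init -q "$tmp/upstream"
cd "$tmp/upstream"
echo base > shared.txt
git add shared.txt
git commit -q -m "base"

# Our clone, holding uncommitted draft work.
git clone -q "$tmp/upstream" "$tmp/work"
cd "$tmp/work"
echo "my draft" > mine.txt

# Meanwhile, upstream moves on.
cd "$tmp/upstream"
echo "upstream change" >> shared.txt
git commit -q -a -m "upstream change"

cd "$tmp/work"
# 1. Commit the draft instead of stashing it.
git add mine.txt
git commit -q -m "WIP: my draft"

# 2. Fetch, then replay the draft commit on top of upstream.
git fetch -q origin
git rebase -q '@{upstream}'

# 3. Keep working and fold the new changes into the same commit.
echo "more work" >> mine.txt
git commit -q -a --amend --no-edit

subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

If the rebase stops on a conflict, the draft is still safe as a commit, and git rebase --abort returns the branch to where it was.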

OznOg

You're touching on the crux of the issue - why are we debating how fast/often I should commit, and what does that have to do with keeping up to date? --> I am not "refusing to merge"; quite the contrary, I want to merge, but not commit. – haelix Sep 29 '18 at 18:54
Well, Git is a tool designed for committing often and performing this kind of operation as a "normal" workflow. If you really don't want to commit, maybe you are looking for SVN (or some other versioning system) – OznOg Sep 30 '18 at 10:48
When you say _often_ what do you mean? How often is often? Rephrasing the question, how do you delimit your commits with git? How do you delimit commits with other VCS tools? (If used) – haelix Sep 30 '18 at 21:26
Compared to SVN (for example), committing does not mean propagating changes; it just records a set of changes. Whenever you want or need to, you can still "amend" your commits so that they better reflect what you wanted. THEN you "push" them so that you can share them (this is simplified, but it is the idea). To answer the "how often": whenever you have something meaningful to you, for example when the code compiles. Later, you can still amend to fix code that was, for example, not passing the unit tests. I personally don't fear committing 10 times on the same line – OznOg Oct 01 '18 at 17:41

score 0 · Answer 2 · answered Sep 29 '18 at 17:59

If you have changes—i.e., a dirty work-tree—you have something to commit. At most, you have something you want to commit temporarily.

In other version control systems, you might do this by committing it on a branch. In Git, you can do that too: you don't have to use a branch, but that may be the most convenient way to deal with it in Git, too.

You mention a starting scenario, for which I have a work pattern I find helpful:

A very common scenario arising when people code together is to have to bring yourself (let's say, the feature branch you're working on) up-to-date.

Let's say that you are working on feature-X, which will eventually be put into dev (development branch). Other developers have been working on features Y and Z and one of them has finished and dev is now updated, so you run:

$ git fetch

and see that your dev is now behind origin/dev. That is, you now have:

...--C--D--H   <-- master
         \
          \         I--J   <-- origin/dev
           \       /
            E--F--G   <-- dev, feature-X (HEAD)

in your repository. You also have some things in your work-tree that are different from files in commit G. (You might not have a branch named feature-X yet, and have HEAD attached to dev instead. If this is the case, you can simply create it now with git checkout -b feature-X, and now you match the picture.)

The thing to do at this point is to commit the stuff you're working on anyway. This makes one new commit K:

...--C--D--H   <-- master
         \
          \         I--J   <-- origin/dev
           \       /
            E--F--G   <-- dev
                   \
                    K   <-- feature-X (HEAD)

You can now fast-forward your own dev to origin/dev. The basic command method is:

$ git checkout dev                 # safe, since your work is committed
$ git merge --ff-only origin/dev   # or `git pull` if you really insist

The drawing now looks like this:

...--C--D--H   <-- master
         \
          \
           \
            E--F--G--I--J   <-- dev (HEAD), origin/dev
                   \
                    K   <-- feature-X

Here's where most people just run git checkout feature-X; git rebase dev, which is fine and you should feel free to use that method. (I do that a lot of the time. Consider doing that followed by the git reset HEAD^ trick described below.) But what I sometimes do is simply rename feature-X to feature-X.0, and then create a new feature-X with git checkout -b feature-X:

...--C--D--H   <-- master
         \
          \
           \
            E--F--G--I--J   <-- dev, origin/dev, feature-X (HEAD)
                   \
                    K   <-- feature-X.0
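
The rename-and-recreate step can be sketched as follows. This is only an illustration in a scratch repository: the single local commit J stands in for the commits that arrived via origin/dev, and the file names are invented.

```shell
#!/bin/sh
# Sketch: park the old feature branch under a numbered name and start a
# fresh feature-X at the updated dev. Scratch repo; file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# G: the commit that dev and feature-X share.
echo g > g.txt
git add g.txt
git commit -q -m G
git branch -m dev        # call the initial branch dev, whatever its default name

# K: work committed on feature-X.
git checkout -q -b feature-X
echo k > k.txt
git add k.txt
git commit -q -m K

# J: stands in for the fast-forward of dev to origin/dev.
git checkout -q dev
echo j > j.txt
git add j.txt
git commit -q -m J

# The step itself: rename the old branch, then create the new one
# from the current commit (the tip of dev).
git branch -m feature-X feature-X.0
git checkout -q -b feature-X

new_tip=$(git rev-parse feature-X)
dev_tip=$(git rev-parse dev)
old_subject=$(git log -1 --format=%s feature-X.0)
current=$(git rev-parse --abbrev-ref HEAD)
```

Renaming only moves a label, so commit K stays reachable through feature-X.0 for as long as that name exists.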

I am now ready to begin working on feature-X again, and at this point I simply cherry-pick all the feature-X.0 commits:

$ git cherry-pick dev..feature-X.0

which produces commit K' which is a copy of K:

...--C--D--H   <-- master
         \
          \               K'  <-- feature-X (HEAD)
           \             /
            E--F--G--I--J   <-- dev, origin/dev
                   \
                    K   <-- feature-X.0

This works even if there are multiple commits on feature-X.0:

...--C--D--H   <-- master
         \
          \               K'-L'  <-- feature-X (HEAD)
           \             /
            E--F--G--I--J   <-- dev, origin/dev
                   \
                    K--L   <-- feature-X.0

If the last commit of this new feature-X (L' in this version, K' in the one that had just one commit) is really, seriously not-ready-to-be-committed yet, at this point I just use git reset HEAD^ (you can spell it HEAD~ if that's easier, as apparently is the case on Windows) to move the branch name back one step. This removes the end-most commit from the current branch, giving:

...--C--D--H   <-- master
         \
          \               K'  <-- feature-X (HEAD)
           \             /
            E--F--G--I--J   <-- dev, origin/dev
                   \
                    K--L   <-- feature-X.0

and leaves the work-tree "dirty" in exactly the way it was before I began this whole process. (In general, a partial commit is fine, as I'll use git rebase -i later to clean everything up when feature-X is mostly ready.)
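
To see the effect of the cherry-pick followed by git reset HEAD^ in isolation, here is a small sketch in a scratch repository. The commit J again stands in for what came from origin/dev, and the files are invented for the demo:

```shell
#!/bin/sh
# Sketch: copy K and L onto a fresh feature-X, then un-commit L' while
# keeping its changes in the work-tree. Scratch repo; names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# G: the shared starting commit.
echo g > g.txt
git add g.txt
git commit -q -m G
git branch -m dev

# K and L: earlier feature work, kept under the name feature-X.0.
git checkout -q -b feature-X.0
echo k > k.txt
git add k.txt
git commit -q -m K
echo l >> k.txt
git commit -q -a -m L

# J: stands in for the commits that arrived from origin/dev.
git checkout -q dev
echo j > j.txt
git add j.txt
git commit -q -m J

# New feature-X at the tip of dev, then copy K and L onto it.
git checkout -q -b feature-X
git cherry-pick dev..feature-X.0 >/dev/null

# L' is not ready yet: move the branch back one commit, keep the files.
git reset -q HEAD^

tip=$(git log -1 --format=%s)      # subject of the new branch tip
dirty=$(git status --porcelain)    # what is uncommitted again
printf 'tip: %s\nstatus:%s\n' "$tip" "$dirty"
```

With no mode flag, git reset performs a mixed reset: the branch and the index move back to K', while the file contents from L' stay in the work-tree as unstaged modifications.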

If I have to repeat the process, for whatever reason, I rename the current in-progress feature-X to feature-X.1, or feature-X.2, or whatever. I grow a small collection of feature-X-es and occasionally go out to the branch garden and prune the weediest ones. The latest-and-greatest is still named feature-X, but I have all of my previous work available as needed, up until I weed them out. This is better than either rebase or stash, because if the code is complicated and I miss something, I still have the older version, under a name that's recognizable, not just some incomprehensible hash ID in a reflog, and not a stash that's indistinguishable from a dozen other stashes.

I started reading through your post with interest, but stopped at the point you mention "feature-X.0", where I got doubts that you are seriously proposing this as an answer to my question. My question involved _one branch_, an extremely commonplace scenario that is solved by one _conceptually_ simple operation. Like the other answer, you are also trying to make me commit as if I want to but don't know it. So how many branches and operations does Git require to do this? — haelix, Sep 29 '18 at 19:18
Ultimately, if you have changes to the same files as some incoming commit(s), you *must* perform a merge (as in the verb, *to merge*). The *safe* way to do that is to commit first. Git does not have a front end interface to do this unsafely. So you need a commit. You *can* use `git stash`, which makes commits. I don't *recommend* this method, because when things go wrong (which they do), you're far better off if you have a branch. But you *can* run `git stash`, bring in the new commits, and run `git stash apply`, check that it worked, and `git stash drop` if it worked. You can even [cont'd] — torek, Sep 30 '18 at 06:37
... can even shorten this to `git stash && git pull && git stash pop` (but see https://stackoverflow.com/q/52568548/1256452) or `git pull --autostash --rebase` (but see the question that prompted that question). — torek, Sep 30 '18 at 06:39
_Ultimately, if you have changes to the same files as some incoming commit(s), you must perform a merge (as in the verb, to merge). The safe way to do that is to commit first._ I totally agree on the first part, we agree on the semantics of "merging". But why commit? You highlighted the word _"safe"_. Why does committing bring safety to the merge? Ultimately, the crux of the matter is, why am i expected to merge **inside the HEAD** and not inside working dir where I can naturally adjust the outcome of a faulty merge? (Working dir means it's being **worked on** no?) — haelix, Sep 30 '18 at 21:17
The reason to commit first is that if you don't, merge will destroy your current state. Consider how *to merge* (a verb) works: you have three inputs, which are the merge base and the two tips. You have one output, which is either a new commit, or Git's best effort left in your work-tree. Normally the three inputs are commits, which are read-only and as permanent as the commits themselves. You *can* (by using `git merge-recursive` directly) treat the work-tree as one of the inputs, but since it *overwrites the work-tree* to produce its result, what happens to your input if the merge fails? — torek, Sep 30 '18 at 21:41
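
For reference, the stash-based sequence mentioned in these comments (stash, bring in the new commits, stash apply, check, then stash drop) might look like the sketch below. It assumes a throwaway upstream repository and clone, with the two edits on distant lines of the same file so that the apply succeeds; paths and contents are invented.

```shell
#!/bin/sh
# Sketch: stash, fast-forward, re-apply, and drop the stash only on success.
# Throwaway repositories; paths and file contents are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > file.txt
git add file.txt
git commit -q -m "base"

git clone -q "$tmp/upstream" "$tmp/work"

# Upstream edits the first line...
printf '%s\n' one 2 3 4 5 6 7 8 9 10 > file.txt
git commit -q -a -m "upstream edit"

# ...while the work-tree has an uncommitted edit to the last line.
cd "$tmp/work"
printf '%s\n' 1 2 3 4 5 6 7 8 9 ten > file.txt

git fetch -q origin
git stash push -q                      # saves the dirty state as commits
git merge -q --ff-only '@{upstream}'   # work-tree is clean, so this is safe

# apply (not pop): the stash entry survives if the re-apply hits conflicts.
if git stash apply -q; then
    git stash drop -q
else
    echo "stash apply hit conflicts; the stash entry is kept" >&2
fi

merged=$(cat file.txt)
printf '%s\n' "$merged"
```

Using apply plus an explicit drop, rather than pop, keeps a copy of the pre-merge state around until the result has been checked.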
