What's the equivalent of hg pull and merge in git

Asked Sep 09 '14 at 19:52

Active Jun 19 '17 at 13:53


Coming from mercurial, I'm struggling a bit with the terminology and methods for git. Here's the current situation:

Both servers share the same repository.

On Server 1 a developer committed a change to file.php.
On Server 2 a developer committed a change to file.php. Same change, but different commit.

I need to bring these changesets back in sync

To resolve this in mercurial, I would run the 3 commands below from Server 2. Any merge conflicts would get resolved in an editor or meld.

hg pull -u ssh://remote-server1//shared-repository
hg merge
hg commit -m "Branch Merge"

How do you do this with git?

What I've done

Using https://www.mercurial-scm.org/wiki/GitConcepts as a guide, I attempted git pull; however, it fails since I have not run git config. This repository is in production and shared by many, so I'm trying to avoid setting a username.

Following https://stackoverflow.com/a/17713604/456645, I ran this command:

git fetch ssh://remote-server1//shared-repository

It responds

From ssh://remote-server1//shared-repository
 * branch            HEAD       -> FETCH_HEAD

I have yet to figure out how to properly run git reset in this situation.

git reset ssh://remote-server1//shared-repository/master 
# FAILS (fatal: Invalid object name 'ssh'.)

git reset origin/master
# FAILS (fatal: ambiguous argument 'origin/master': unknown revision or path not in the working tree.)

git reset FETCH_HEAD/master
# FAILS (fatal: ambiguous argument 'FETCH_HEAD/master': unknown revision or path not in the working tree.)

Looking further, I'm confused to only find a single branch

git branch -a
* master

Just to be sure it wasn't auto-merged, I check the history

git log -n 3 
# Here, I find the changesets that previously existed.  Nothing new

At this point, I don't even know what happens to the git fetch. I see no evidence of the fetch.

edited Jun 19 '17 at 13:53

Vadim Kotov


asked Sep 09 '14 at 19:52

bitsoflogic


List all branches with "git branch -a". Also, here is a git/hg cheat-sheet: https://github.com/sympy/sympy/wiki/Git-hg-rosetta-stone/eb04a48a4830c1cb71fd04a56e8aaf8b4517329c – folkol Sep 09 '14 at 20:05

"origin" in the reset examples above is the name of one of the git remotes (Run git remote -v to list your remotes). You could add a remote to repo 1 with "git remote add server1 ssh://remote-server1//shared-repositor", and then you can use "git pull server1" / "git reset server1/mybranch" etc. – folkol Sep 09 '14 at 20:08
Updated response. `git branch -a` still shows only * master – bitsoflogic Sep 09 '14 at 20:11
Thank you for the follow-up comment as well. That worked perfectly. What's the `git reset server1/mybranch` for? I haven't run that command yet. – bitsoflogic Sep 09 '14 at 20:19
@folkol Also, please feel free to make this an answer and I'll accept it – bitsoflogic Sep 09 '14 at 20:19
Ultimately, I just set the user.email and user.name config settings and followed the instructions for adding remote servers to simplify the pull. – bitsoflogic Sep 09 '14 at 20:21
I added an answer, if the repositories are "bare repositories" (That is, you can not work in them), then you will have to clone repo2 out to your workstation and do it from there. – folkol Sep 09 '14 at 20:24
"git reset someremote/somebranch" will reset your current HEAD so that it points to "whatever someremote/somebranch" points to. – folkol Sep 09 '14 at 20:25

2 Answers

TL;DR version

Add a remote for each repository. I'll call the two repositories alex and bob, but you should probably choose more appropriate names. Note that if you git clone one of them, that one will have the name origin automatically (although you can choose another name). The commands below assume you already have a related repository and need to add both; if you already have origin, you just need to add the other one.

$ git remote add alex ssh://their.domain.name/their/path/to/repo.git
$ git remote add bob ssh://bobs.domain.name/and/his/path.git

After that you can easily git fetch everything (all branches, etc) from them:

$ git fetch alex
$ git fetch bob

You now have "remote tracking branches" named alex/master, alex/develop, etc., wherever Alex has branches named master, develop, etc.; and you have bob/master and so on wherever Bob has master.

Now if you wish to merge stuff, you can make your own local branch(es) corresponding to one of theirs:

$ git checkout -b alex-master alex/master

(alex-master is a local branch; you could just call it master if you prefer, but each local branch name must be unique, so you might have to move your own existing master out of the way first) and merge with bob's:

$ git merge bob/master

(this merges bob/master into your local alex-master). If you have permission you can push the result back to Alex's master:

$ git push alex alex-master:master

and also to Bob's master:

$ git push bob alex-master:master

The "refspec" given as the last argument to git push takes your local branch name on the left-hand side of the colon, and the name of the branch in their repository on the right.
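As a rough sketch, the whole sequence above can be tried end to end against throwaway repositories. Everything outside the git commands themselves is an assumption made for illustration: the temporary directory, the commit_file helper, the "Demo" identity, and the local bare repositories alex.git and bob.git standing in for the ssh:// URLs. The identity is supplied through git's GIT_AUTHOR_* and GIT_COMMITTER_* environment variables, so no git config is written anywhere.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# Identity for this shell only; nothing is stored in any git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: commit one new file in a working repository.
commit_file() {  # usage: commit_file <repo> <filename>
    echo "$2" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

# A seed repository, copied to two bare "server" repositories.
git -c init.defaultBranch=master init -q "$work/seed"
commit_file "$work/seed" base.txt
git clone -q --bare "$work/seed" "$work/alex.git"
git clone -q --bare "$work/seed" "$work/bob.git"

# Give each server its own extra commit so the two masters diverge.
for who in alex bob; do
    git clone -q "$work/$who.git" "$work/tmp-$who"
    commit_file "$work/tmp-$who" "$who.txt"
    git -C "$work/tmp-$who" push -q origin master
done

# The workflow from the answer, run in a related local repository.
git clone -q "$work/seed" "$work/mine"
cd "$work/mine"
git remote add alex "$work/alex.git"
git remote add bob "$work/bob.git"
git fetch -q alex
git fetch -q bob
git checkout -q -b alex-master alex/master
git merge -q -m "Merge bob/master" bob/master

# refspec: <my local branch>:<branch name in their repository>
git push -q alex alex-master:master
git push -q bob alex-master:master
```

If this runs cleanly, master in both bare repositories should point at the same merge commit as the local alex-master.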

Unlike Mercurial, there is no special branch name; master is just conventional, while in hg, default is a bit magic. (Well, there's one special magic bit for master in git: it changes the way git merge spells the default commit message. But that's really about it.)

There are a bunch of parts to this, some of which are mostly cosmetic/convenience, and some of which are crucial.

First, in git these days one mostly refers to repositories through "remotes". A remote is partly a convenience construct but this also plays into the crucial bits, in terms of obtaining and retaining commits from another repository.

Mercurial has something much simpler than remotes, but which serves part of the convenience: in a [paths] section in your .hgrc file you can list the long URL under a short name:

[paths]
short = ssh://something.very.long/with/a/very/long/path/that/is/a/pain/to/type/every/time

or whatever. Compare this to a git "remote", in .git/config:

[remote "short"]
    url = ssh://something.very.long/with/a/very/long/path/etc

So far these are exactly the same, just with different spellings/syntax. But then there's this bit:

    fetch = +refs/heads/*:refs/remotes/short/*

(in fact, this usually appears before the url but the order of these two does not matter). We'll get back to this in a bit.
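To see where that fetch = line comes from, here is a small sketch that adds a remote to a scratch repository and reads the two settings back. The scratch directory and the example.invalid URL are placeholders; git remote add only writes configuration and does not contact the URL.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
git -c init.defaultBranch=master init -q "$work/repo"
cd "$work/repo"

# Placeholder URL; "remote add" only records it in .git/config.
git remote add short ssh://example.invalid/path/to/repo.git

url=$(git config --get remote.short.url)
refspec=$(git config --get remote.short.fetch)
echo "url   = $url"
echo "fetch = $refspec"
```

The fetch value read back here is the default refspec that git remote add writes for the remote named short.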

Next, the most direct equivalent of hg pull is indeed git fetch. However, hg pull has -u (update the working directory), and git fetch has nothing like it: the internals of the two operations wind up being quite different, owing to the difference between a Mercurial branch and a git branch.

A Mercurial branch is sort of a "real thing", while a git branch is much more ephemeral. More specifically / precisely, a git branch is much more like a Mercurial bookmark than it is like a Mercurial branch. However, there's a concept (or maybe "conceit" might be a better word) that simply does not translate properly here: specifically, commits added to a Mercurial repository are numbered sequentially and remain in the repository even if nothing names them: you just wind up with an anonymous head. For instance you can have a commit graph (shown with hg log --graph etc) that looks like:

o
|  o
o  |
|  o
o /
|/
o

There does not need to be any external name pointing to these two heads; they're just anonymous heads (within a branch since all commits in hg are within some branch by definition).

In git, by contrast, a commit with no external name is eligible to be garbage-collected (what hg might call "stripped", although the details differ). So all of what hg calls "heads" must have some kind of name. This is where a remote's fetch = line comes into effect.

"But wait," you might ask, "I can do git fetch ssh://... and that has no named remote!" This is where that HEAD -> FETCH_HEAD bit comes in.

If you don't have a "remote", you can still bring in some branch (or even more than one) from another repository. However, the heads of any branches thus obtained need a name. This "name" is stuffed into the pseudo-branch named FETCH_HEAD. The details get complicated, but if you bring over only a single branch, as in this case, it's easier to explain and think about: essentially, git brings over the one named branch and, in your local repository, gives this the name FETCH_HEAD.

Note that a subsequent git fetch generally overwrites FETCH_HEAD, at which point all the commits you brought over are eligible for garbage-collection. Thus, you need another step, once you have the commits.
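The "no evidence of the fetch" situation from the question can be reproduced in a sketch like the one below. The server1 and server2 directories are local stand-ins for the two machines, and the "Demo" identity is supplied through environment variables; both are assumptions for illustration. After a fetch by URL, git branch -a still lists only master, but the fetched commits are reachable through the name FETCH_HEAD.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# "Server 1" with one commit; "server 2" starts as a copy with no remote.
git -c init.defaultBranch=master init -q "$work/server1"
echo one > "$work/server1/file.php"
git -C "$work/server1" add file.php
git -C "$work/server1" commit -q -m "base"
git clone -q "$work/server1" "$work/server2"
git -C "$work/server2" remote rm origin

# A new commit appears on server 1 only.
echo two >> "$work/server1/file.php"
git -C "$work/server1" commit -q -a -m "change on server 1"

cd "$work/server2"
git fetch -q "$work/server1"     # fetch by URL/path: only FETCH_HEAD is set

git branch -a                                  # still just master
git --no-pager log --oneline master..FETCH_HEAD  # what the fetch brought in
fetched=$(git rev-parse FETCH_HEAD)
```

FETCH_HEAD is used directly as a commit name, which is why FETCH_HEAD/master in the question was rejected.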

Now, once you have some commits and want to retain them, there are different ways to do this. The most hg pull-like is to use the "remote" mechanism. Here, that fetch = line takes care of the problem. When you git fetch from the remote, the remote has branch names pointing to all the branch heads (by definition since git requires this). These branch names are references of the form refs/heads/name.

The fetch = line tells git how to rename the branches so that they have new, unique local names in your local repository. By renaming refs/heads/master to refs/remotes/short/master, for instance, you get short/master as a new local name for what the remote called "branch master". This retains the commits and we're good (though we still have to merge).
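The renaming can be observed with a sketch along these lines. The repositories theirs and mine, the branch develop, and the "Demo" identity are all made up for the example. After fetching from the remote named short, their refs/heads/* show up locally under refs/remotes/short/*, and no local branches are created.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# "Their" repository with two branches, master and develop.
git -c init.defaultBranch=master init -q "$work/theirs"
echo hello > "$work/theirs/README"
git -C "$work/theirs" add README
git -C "$work/theirs" commit -q -m "base"
git -C "$work/theirs" branch develop

# My repository: empty, with "theirs" added as the remote named short.
git -c init.defaultBranch=master init -q "$work/mine"
cd "$work/mine"
git remote add short "$work/theirs"
git fetch -q short

# List the remote-tracking names the fetch created.
git for-each-ref --format='%(refname)' refs/remotes/short

theirs_master=$(git -C "$work/theirs" rev-parse refs/heads/master)
mine_short_master=$(git rev-parse refs/remotes/short/master)
```

Here short/master in the local repository names the same commit that master names in the other one.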

Alternatively, we can just get the one interesting branch over and call it FETCH_HEAD and then immediately merge. When we do this, we pick some local branch (e.g., master), check it out if needed, and run git merge FETCH_HEAD.

Once the merge is done, we have our local branch (master, in this example) pointing, perhaps through a merge commit, to the commits we brought over via git fetch, so they are no longer vulnerable to a garbage-collection: they are on the branch, hence name-able via that branch name. We can now safely overwrite FETCH_HEAD any time.
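A sketch of this fetch-then-merge route, modelled on the question's scenario: the same edit to file.php is committed independently in two repositories, then one side fetches the other by path and merges FETCH_HEAD. The directory names, file contents, and "Demo" identity are assumptions for illustration. Because both sides end up with identical file contents, the merge should complete without conflicts.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Common starting point on both "servers".
git -c init.defaultBranch=master init -q "$work/server1"
echo "<?php" > "$work/server1/file.php"
git -C "$work/server1" add file.php
git -C "$work/server1" commit -q -m "base"
git clone -q "$work/server1" "$work/server2"

# The same edit, committed separately on each side (two different commits).
for s in server1 server2; do
    echo "echo 'hi';" >> "$work/$s/file.php"
    git -C "$work/$s" commit -q -a -m "edit on $s"
done

# From server 2: fetch server 1, then merge what was fetched.
cd "$work/server2"
git fetch -q "$work/server1"
theirs=$(git rev-parse FETCH_HEAD)
git merge -q -m "Branch Merge" FETCH_HEAD

# The merge commit's line lists itself plus its two parents.
parents=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
git --no-pager log --oneline --graph -n 4
```

Once master contains the merge, the fetched commit is reachable from a branch name, so a later fetch overwriting FETCH_HEAD no longer matters.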

This last method (bringing stuff over but naming it FETCH_HEAD, then merging a local branch with FETCH_HEAD) is what the "git pull" command does. In that sense, it's just git fetch followed by git merge. I personally try to avoid the pull script: it has a lot of extra magic to handle various corner cases, and in the old days it would go awry in cases nobody had considered. In general I prefer to fetch, look over what happened, and then choose whether and how to merge or rebase my own code.

Ultimately there's no single right way, but if you are going to do this more than once, I would add a git remote, and use the "remote branch" notion (the fetch = stuff that renames "their" branches to your own locally-stored "remote-tracking branches").

edited Sep 09 '14 at 21:37

answered Sep 09 '14 at 20:45

torek


This was an eye opener! I really appreciate the comparisons with Mercurial terminology. Also, the more subtle clarification clues ("check it out if needed") were a big help to understand the full picture. – bitsoflogic Sep 10 '14 at 13:58

EDIT: Normally, a git remote is a "bare" repository, which means there is no checked-out working copy on the server. The steps below work even if the server repo is a bare repo.

Clone one of the repositories to your local computer:

$ git clone ssh://remote-server1//shared-repository

cd into that folder:

$ cd shared-repository

Add the second one as a remote with name server2:

$ git remote add server2 ssh://remote-server2//shared-repository

Fetch everything from the repository on server 2 (same as hg pull):

$ git fetch server2

Merge the master branch of server2 into your local master:

$ git merge server2/master

Push the result back to server 2:

$ git push server2 master

Push the result back to server 1 (origin is the remote that you cloned from):

$ git push origin master

edited Sep 09 '14 at 21:12

answered Sep 09 '14 at 20:22

folkol
