Suppose two users are working on the same branch (e.g. master). I know this premise is not good practice, but let's use it for simplicity's sake.

The first user runs these commands:

git clone <repository_url>
cd myproject
touch file1.cpp
git add file1.cpp
git commit -m "file 1 creation"

The second user then runs:

git clone <repository_url>
cd myproject
touch file2.cpp
git add file2.cpp
git commit -m "file 2 creation"

First user:

git push origin master

Second user:

git pull

Now, Git is performing a merge.

At this point, I sometimes see master|MERGING in my bash prompt (i.e. before $) so Git automatically created a branch for merging. In other instances, Git does not create a new branch, because my bash prompt still says master.

If it helps, I am working in different operating systems (Linux, Mac, Windows).

So does Git create a branch when merging?

Thanks.

RoShamBo
Bob5421

5 Answers

Doing git pull is the same as doing git fetch followed by git merge. At the end of the second user's merge, Git will create a new merge commit, but it won't, by itself, create a new branch.

The master|MERGING you see on the console just means that Git is in the middle of a merge operation.

Tim Biegeleisen
  • @Bob5421 It just means you are in the middle of an unfinished merge. You may try running `git branch -a` when you see this. My guess is that you won't see any new branches. If Git does create a temporary branch, it is not visible to us. – Tim Biegeleisen May 27 '18 at 09:04
  • Thanks all, but how can I see whether I am in a merging state if I do not see |MERGING in the console? – Bob5421 May 27 '18 at 09:09
  • @Bob5421 That strikes me as something of a circular question. If you do a merge and you _don't_ see `MERGING`, then it usually means that Git was able to complete the merge without input from you. – Tim Biegeleisen May 27 '18 at 09:20

Your question reveals a lot of confusion about what a Git branch is and how merges work in Git. (This is not all that surprising since a lot of introductions to Git are ... not good, and Git has a lot of underlying complexity that really cannot be ignored.)

I recommend that anyone new to Git avoid git pull. Instead, run git fetch, and then if you want merges, run git merge. If you prefer a rebase-oriented workflow, run git fetch, then run git rebase. This will give you a better sense of what's going on underneath. It will also avoid a lot of mistakes that people almost always make as they first start using Git. The opposite of push is fetch, not pull.

You mention that:

Sometimes ... I see master|MERGING in my bash prompt prefix

You can also run git status to see whether you are in the middle of a conflicted merge.

In fact, what you have in your bash prompt is a set of clever interactions between your bash shell and your Git commands, where your bash checks with Git each time it's about to print a prompt. If Git reports that you are in a repository at all, the shell includes the repository's current-branch-name in the prompt. If you are in the middle of one of these conflicted merges, you get |MERGING added as well.
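If you want to make the same kind of check yourself, here is a minimal Python sketch. It relies on the fact that Git keeps a file named MERGE_HEAD in the repository's .git directory while a merge is unfinished. The function name is_merging is made up for illustration, and the sketch assumes an ordinary layout where the Git directory is .git directly under the work-tree (real prompt scripts handle more cases than this).

```python
import tempfile
from pathlib import Path


def is_merging(repo_root: Path) -> bool:
    """Return True if the repository at repo_root has an unfinished merge.

    Assumption: the Git directory is repo_root/.git (not a bare
    repository or a linked work-tree).
    """
    return (repo_root / ".git" / "MERGE_HEAD").is_file()


# Demonstration on a fake directory layout (not a real repository).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".git").mkdir()
    before = is_merging(root)  # no marker file yet
    (root / ".git" / "MERGE_HEAD").write_text("0" * 40 + "\n")
    during = is_merging(root)  # marker file present

print(before, during)
```

In a real repository you would call is_merging(Path(".")) from the top of the work-tree; git status reports the same state in words.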

But this leads back to the question: What exactly do we mean by "branch"? (That is the title of another question, which covers this in more detail.) The word branch can refer to either a branch name like master, or a subgraph within the DAG (what I like to call a DAGlet; see that other question).

The git merge command does not create a new name, so in this sense, it never creates a branch.

On the other hand, the git merge command can tie together two subgraphs within the DAG—creating a new DAGlet. In this sense, it does create a new branch. Well, it does so sometimes, and it would be more accurate to say that it ties together some existing DAGlets. This is where that unavoidable complexity comes in.

Let's take a moment to examine commits

Probably the most important thing in Git is the commit. In Git, the true name of any commit is its hash ID—some big ugly string of hexadecimal digits, apparently random, basically useless to humans, but each string is unique to that particular commit and identifies it. That's how Git finds it: you tell it to look for some big ugly hash, a1f9c32... or whatever, and your Git looks up the commit.

Each commit stores the ID of its previous or parent commit. What this means is that if you tell Git about the latest commit, Git can use that commit to look back to the second-latest commit. That second-latest commit has inside it the ID of the third-latest commit, and so on. If the commit IDs were just easy uppercase letters, we could draw them like this:

... <-F <-G <-H

where H is the latest, and it points to (contains the ID of) G, which points to F, and so on, backwards through history.

These commits are the history; all Git needs to know is which one is the latest. That's where a branch name like master comes in: instead of making you memorize the crazy hash IDs, Git stores the latest one under a name like master.

This means that when you add a new commit to your master, what you are doing is having Git save, in a new commit, the ID of the old tip of the chain, and then having your Git rewrite your master to hold the ID of the new tip:

...--F--G--H--I   <-- master

Now I points back to H, which (still) points back to G, and so on.
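To make the "name holds the ID of the latest commit" idea concrete, here is a toy Python model. It is only a sketch: the Commit class, the make_commit and history helpers, and the way the hash is computed are all invented for illustration (real Git hashes the whole commit object, including the snapshot, author and dates).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    message: str
    parent: Optional[str]  # hash ID of the previous commit, None for the first


def commit_id(c: Commit) -> str:
    # Toy hash over two fields only; real Git hashes the full commit object.
    return hashlib.sha1(f"{c.parent}:{c.message}".encode()).hexdigest()


commits: Dict[str, Commit] = {}  # hash ID -> commit
branches: Dict[str, str] = {}    # branch name -> hash ID of the tip commit


def make_commit(branch: str, message: str) -> str:
    """Store a new commit whose parent is the old tip, then move the name."""
    new = Commit(message, branches.get(branch))
    cid = commit_id(new)
    commits[cid] = new
    branches[branch] = cid
    return cid


def history(branch: str) -> List[str]:
    """Walk backwards from the tip, following parent IDs."""
    out = []
    cid = branches.get(branch)
    while cid is not None:
        out.append(commits[cid].message)
        cid = commits[cid].parent
    return out


for msg in ["F", "G", "H", "I"]:
    make_commit("master", msg)

print(history("master"))
```

Note that the only thing stored under the name master is one hash ID; everything else is found by walking backwards.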

There is more than one Git repository involved

You start with this:

Let's suppose I have 2 users working on the same branch (master).

Regardless of whether this is good practice or not, there's a sense here that the name, master, means only one thing. But that's just not true, because you have a Git repository, and the second user has a Git repository, and the place you're git pushing to has a Git repository. Everyone gets a repository! And every one of you has your own master.

What you all share, however, is some set of commits. You all started out by cloning some repository, and that had some set of commits, and you got them all. You may have added some more since then. So, for each commit hash ID, your Git repository either has that commit, indicated by its ID, or doesn't. If your Git repository doesn't have that commit, that's where git fetch and git push come in.

What git fetch and git push do is to connect two Git repositories. At this point, whoever is doing the sending—your Git if you are the one doing git push, or the other Git repository if you are doing git fetch—packages up any commits they have that you don't, or vice versa. The sender delivers that pack of commits (and the files that go with it) to the receiver.

The receiver now has a bit of a problem, because commits that are only identified by hash IDs are pretty useless to humans. The receiver needs to give the last commit a name.

When you run git fetch origin,[1] you obtain new commits from their master, so the name your Git uses to remember their master is origin/master.


[1] Here, origin is the name of a remote. Most repositories have exactly one remote, named origin: when you run git clone <url> to clone a repository, the clone process sets up this remote, whose name defaults to origin, to remember the URL.


Fetching, then merging

Let's suppose that you both started with a chain of commits ending at H, and you've added I--J and they—whoever they is here—added K--L:

...--F--G--H--I--J   <-- master
            \
             K--L   <-- origin/master

It's now your job to combine your work—whatever you did in commits I-J—with their work, whatever they did in K-L.

The simplest method of combining in Git is git merge. This particular kind of merge, which I like to call a true merge, works by quite literally combining your work and their work. To do this, it has to start from the point where you two branched apart. Note that this branching-apart has nothing to do with the name master itself. It's because you made commits, and they made commits.

The merge operation has two parts. The first is what I like to call merge as a verb, or to merge: to combine work. Now, it's clear from looking at the drawing above that the last commit you both had in common was commit H. This is what Git calls the merge base.
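Finding that common commit can be sketched in a few lines of Python. This is a toy: single letters stand in for hash IDs, the parents table is typed in by hand from the drawing above, and the ancestors and merge_base helpers are names I made up. Real Git can find more than one merge base in tangled histories, which this sketch ignores.

```python
from typing import Dict, List, Set

# Parent links for the drawing above; letters stand in for hash IDs.
parents: Dict[str, List[str]] = {
    "F": [], "G": ["F"], "H": ["G"],
    "I": ["H"], "J": ["I"],
    "K": ["H"], "L": ["K"],
}


def ancestors(tip: str) -> Set[str]:
    """All commits reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge_base(a: str, b: str) -> str:
    """A common ancestor that is not an ancestor of another common ancestor."""
    common = ancestors(a) & ancestors(b)
    best = [c for c in common
            if not any(c in ancestors(o) for o in common if o != c)]
    return best[0]  # this toy graph has exactly one candidate


print(merge_base("J", "L"))
```

With real repositories, git merge-base does this lookup for you.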

Git now runs git diff twice.[2] You can do it yourself:

git diff --find-renames <hash-of-H> <hash-of-J> > /tmp/what-we-did
git diff --find-renames <hash-of-H> <hash-of-L> > /tmp/what-they-did

You can now compare what you did to what they did. Git does this same thing in order to combine your changes with their changes.

If the things you changed are "far enough away from" the things they changed, or are in different files, Git will combine them successfully. More precisely, Git will think it combined them successfully. (It's up to you, the human who is smarter than Git, to decide on the real success here.) But if you changed the same source lines that they changed, Git won't be able to combine two different changes. Git will throw its metaphoric hands into the air, declare a merge conflict, and stop and make you clean up the mess.
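Here is a deliberately simplified Python sketch of that combining rule. It assumes both sides only edited lines in place, so all three versions have the same number of lines; real Git works on diff hunks, handles insertions and deletions, and also treats changes that touch or abut each other as conflicts. The names merge_lines and MergeConflict are invented for this example.

```python
from typing import List


class MergeConflict(Exception):
    """Raised when both sides changed the same line in different ways."""


def merge_lines(base: List[str], ours: List[str], theirs: List[str]) -> List[str]:
    """Combine two sets of in-place line edits made against the same base."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("toy model: only same-length files are supported")
    merged = []
    for n, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:      # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:    # only they changed this line
            merged.append(t)
        elif t == b:    # only we changed this line
            merged.append(o)
        else:           # both changed it, differently
            raise MergeConflict(f"line {n}")
    return merged


base = ["int a = 1;", "int b = 2;", "int c = 3;"]
ours = ["int a = 10;", "int b = 2;", "int c = 3;"]
theirs = ["int a = 1;", "int b = 2;", "int c = 30;"]
combined = merge_lines(base, ours, theirs)
print(combined)

try:
    merge_lines(base, ours, ["int a = 99;", "int b = 2;", "int c = 3;"])
    conflicted = False
except MergeConflict:
    conflicted = True
print(conflicted)
```

The first call combines cleanly because the two sides touched different lines; the second raises because both sides rewrote line 1.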

This is when you see the merging status. Git has stopped in the middle of a conflicted merge. The commit graph still looks like the picture above, with two "branches" (in the DAGlet sense) forking off from a common commit; they have not yet come together. It's now your job to edit the mess into something sensible, run git add on the result, and use git commit (or in new enough Git versions, git merge --continue—but this just runs git commit) to finish the merge.

If Git thinks it can do the merge all on its own, though, git merge will go ahead and run git commit on its own too. Git won't stop in the middle with a conflict; it will just go on to the git commit part.

This commit-that-concludes-a-merge, whether Git does it all by itself or stops and makes you clean up and do it, will tie together the two graph DAGlets:

...--F--G--H--I--J--M   <-- master
            \      /
             K----L   <-- origin/master

The new merge commit M has two parents, instead of just the usual one. This is the second part of what git merge does: it creates a merge commit, which uses the word merge as an adjective. The merge commit ties together these two DAGlets. Merging did not create the graph fragments. Merging simply tied them together.
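In the same toy notation (letters for hash IDs, a hand-written parents table, and a made-up merge_commit helper), recording the merge looks roughly like this. The point of the sketch is what does not change: no new name appears, and origin/master stays where it was. H is shown without parents only to keep the table short.

```python
from typing import Dict, List

# H is treated as the start of history here purely to keep the example small.
parents: Dict[str, List[str]] = {
    "H": [], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["K"],
}
names: Dict[str, str] = {"master": "J", "origin/master": "L"}


def merge_commit(new_id: str, current: str, other: str) -> None:
    """Record a two-parent commit and advance only the current branch name."""
    parents[new_id] = [names[current], names[other]]  # first parent: where we were
    names[current] = new_id


names_before = sorted(names)
merge_commit("M", "master", "origin/master")

print(parents["M"])
print(names)
print(sorted(names) == names_before)
```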

Making this merge commit concludes the process of merging, so that you are now back to the normal, non-merging state. However, you now have a new commit that, obviously, no one else has: you have merge commit M, which you just made, which therefore has a new and unique hash ID that no one else could possibly have yet.

(You can now use git push to share any commits that you have that they don't.)


[2] Internally, Git uses a whole bunch of short-cuts to avoid a lot of work if possible, but in the end, the combine part does require computing the diff.

I'm also leaving out a lot of detail here about how the "to merge" process works. This matters mainly when you have to clean up the mess of a failed merge: the merge takes place in your index (also called your staging area) as well as in your work-tree (where you do your work). The separations between commits, the index, and the work-tree matter more as you start to do more advanced things in Git.


When a merge isn't a merge

Not all git merge operations merge! Suppose you haven't done any work since you ran git clone, so that you have, say:

...--F   <-- master, origin/master

Now you run git fetch and pick up new commits G and H:

...--F   <-- master
      \
       G--H   <-- origin/master

Note that G points back to F, which is where you are now. If you now run git merge origin/master (or just git merge) to bring yourself forward, Git notices that there is no actual divergence here. Instead of combining your lack-of-work with their work, Git can simply fast-forward the name master so that it points to commit H, and check out commit H, giving you:

...--F--G--H   <-- master, origin/master

When git merge does this, it says "fast-forward": there's no diffing, no combining of work, and no new merge commit. This process is very easy for Git, compared to a true merge: in the end, it's essentially the same amount of work as git checkout.
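The decision Git makes here can be sketched as an ancestry test. Again this is a toy model in Python with letters for hash IDs; is_ancestor and merge are illustrative names, not Git internals.

```python
from typing import Dict, List

parents: Dict[str, List[str]] = {"F": [], "G": ["F"], "H": ["G"]}
names: Dict[str, str] = {"master": "F", "origin/master": "H"}


def is_ancestor(older: str, newer: str) -> bool:
    """True if older is reachable from newer by following parent links."""
    stack, seen = [newer], set()
    while stack:
        c = stack.pop()
        if c == older:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False


def merge(current: str, other: str) -> str:
    ours, theirs = names[current], names[other]
    if is_ancestor(theirs, ours):
        return "already up to date"
    if is_ancestor(ours, theirs):
        names[current] = theirs  # just slide the name forward
        return "fast-forward"
    return "true merge needed"


result = merge("master", "origin/master")
print(result, names["master"])
```

No entry is added to parents, which is the toy-model way of saying that no new commit was made.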

You can rebase instead of merging

I won't go into great detail here, but instead of merging your work with someone else's work, you can rebase your work on someone else's work. Suppose we start with the same work-that-needs-combining diagram:

...--F--G--H--I--J   <-- master
            \
             K--L   <-- origin/master

Instead of merging, we can have Git copy commits I and J to new, somewhat different commits which we can call I' and J' to remind us that they're a lot like I and J, but not the same. (They'll have new, unique hash IDs, different from your original I and J.) We simply arrange them like this:

...--F--G--H--I--J   <-- master
            \
             K--L   <-- origin/master
                 \
                  I'-J'  <-- ???

Now that we have these copies made, we have our Git "peel the label" master off J and make it point to J' instead:

...--F--G--H--I--J   <-- ???
            \
             K--L   <-- origin/master
                 \
                  I'-J'  <-- master

If you now choose to give up the originals, since you don't need them any more, we can remove the question-mark labels entirely and straighten out one of the kinks in the graph:

...--F--G--H--K--L   <-- origin/master
                  \
                   I'-J'  <-- master

Now it seems as though you wrote your two commits after picking up commit L, instead of writing them based on commit H. The source code that is carried with the two new commits I' and J' is based on what's in L, rather than what's in H.
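The copying step can be modeled in Python as follows. This is a rough sketch under heavy assumptions: a commit is just a (parent, change) pair, the hash is a toy one, and the add and rebase helpers are invented names. In particular, the sketch copies the change description as-is, whereas real Git re-applies each change to the new base and may stop with a conflict.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

# hash ID -> (parent hash ID, description of the change)
commits: Dict[str, Tuple[Optional[str], str]] = {}


def add(parent: Optional[str], change: str) -> str:
    """Store a commit; its toy ID depends on both the parent and the change."""
    cid = hashlib.sha1(f"{parent}|{change}".encode()).hexdigest()[:7]
    commits[cid] = (parent, change)
    return cid


def rebase(tip: str, base: str, onto: str) -> str:
    """Copy the commits after base, up to tip, so that they follow onto."""
    chain: List[str] = []
    cid: Optional[str] = tip
    while cid != base:
        if cid is None:
            raise ValueError("base must be an ancestor of tip")
        chain.append(cid)
        cid = commits[cid][0]
    new_tip = onto
    for old in reversed(chain):  # oldest first
        new_tip = add(new_tip, commits[old][1])
    return new_tip


c_h = add(None, "shared work")
c_i = add(c_h, "my change 1")
c_j = add(c_i, "my change 2")
c_k = add(c_h, "their change 1")
c_l = add(c_k, "their change 2")

new_j = rebase(c_j, base=c_h, onto=c_l)
new_i = commits[new_j][0]
print(new_j != c_j, commits[new_j][1] == commits[c_j][1], commits[new_i][0] == c_l)
```

The copies carry the same changes but get different IDs, because the ID depends on the parent, and the originals are still in the table until something cleans them up.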

If you would have gotten merge conflicts when merging, you will almost certainly get merge conflicts when rebasing—and you may get many repeated conflicts (depending on how many commits you made), and they may be harder to resolve than they would be with merging. But in the end, you avoid the merge commit itself, and when you go to look at the history of all development, it will seem simpler, even if it was actually more complicated.

Whether to do this is up to you. If you do choose to do it, though, you do it instead of merging. Remember that rebase works by copying some set of commits; be careful to copy only those commits that you have not already published (with git push), or to be sure that anyone else who had the originals switches away from the originals to the new improved copies.

One final note on git push

We've seen above that fetch (not pull) is the opposite of push. But there is still one more bit of asymmetry. When you run git push, you have your Git hand over your commits to some other Git repository. This all works by hash IDs, just like fetch works by hash IDs—and at the end, just like fetch had to set a name in your repository to remember their latest master, push has to set a name in their repository to remember your last commit.

But—here's the asymmetry—your Git asks their Git to set their master. It's not their bob/master, it's their master. As a general rule, their Git will refuse to allow this operation unless it's a fast-forward.

This fast-forward is the same kind of thing we saw with a fast-forward not-really-a-merge git merge. It means that we are only adding commits to some branch name: the new commit(s) eventually point back to the commits they already had. That's what you are doing when you run git merge or git rebase: you combine your work with their latest, so that your work adds on to theirs, rather than erasing some of theirs.
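The receiving side's rule can be sketched with the same ancestry test. This toy Python model pretends both repositories share one parents table, which skips the step where the commits themselves are transferred; receive_push and is_ancestor are illustrative names. Here the server already has the other user's commit L, so pushing J is refused, while pushing the merge M is accepted.

```python
from typing import Dict, List

parents: Dict[str, List[str]] = {
    "H": [], "I": ["H"], "J": ["I"],
    "K": ["H"], "L": ["K"],
    "M": ["J", "L"],  # the merge made after fetching
}
server: Dict[str, str] = {"master": "L"}  # their master, after the other push


def is_ancestor(older: str, newer: str) -> bool:
    """True if older is reachable from newer by following parent links."""
    stack, seen = [newer], set()
    while stack:
        c = stack.pop()
        if c == older:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False


def receive_push(name: str, new_tip: str) -> bool:
    """Accept the update only if it keeps every commit the name already reached."""
    if not is_ancestor(server[name], new_tip):
        return False  # rejected: not a fast-forward
    server[name] = new_tip
    return True


pushed_j = receive_push("master", "J")
pushed_m = receive_push("master", "M")
print(pushed_j, pushed_m, server["master"])
```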

In the end, Git is all about the commits, but the names—the branch names like master, or your remote-tracking names like origin/master—are all about finding the commits. The names are both for humans (who can't deal with raw hash IDs) and for locating the tip-most commit on a branch. That tip commit points backwards, in the same way that Git always works backwards, to earlier commits, and by doing so, it keeps those earlier commits alive. The name keeps the tip commit alive and the tip commit keeps earlier commits, and that's why git push demands that the push be a fast-forward.

(You can do non-fast-forward pushes using git push --force or equivalent. This has the effect of killing off some commit in the Git that receives the force-push. If you want to kill them off, that's OK—but other Git users may have fetched those commits, so beware: they could come back!)

torek

In the steps you describe, git pull merges the remote-tracking branch (origin/master) into the local master branch. As Tim's answer says, it creates a merge commit on your local master branch.

To answer your question: both branches are created during the git clone operation. git fetch updates the state of origin/master in the local repository (to the current state on the server).

andrybak

It won't create a new branch. The second user (or any user, for that matter) has a local copy of master. Running git pull fetches the remote changes and merges them. The merge stops partway if there are conflicting changes.

The best way to handle this is for the second user to run:

git pull --rebase

This fetches the remote changes and replays the second user's local commit on top of them. The result can then be pushed to the remote master.

Yes, Git is creating a so-called "merge commit".

If you would like to avoid this, use git pull --rebase each time you pull. This way Git will do the merge "while pulling" and skip creating a "merge commit".

But remember that the --rebase option still forces you to handle potential conflicts.

  • `This way git will do merge "while pulling"` -> No. Git *always* does a merge while pulling, because `pull` is the same as `fetch` + `merge`. The `--rebase` option tells it to _rebase_ instead of merging. – bfontaine May 27 '18 at 09:17
  • One more thing: when I do a git fetch, how can I see the fetched files? I just see my current files... – Bob5421 May 27 '18 at 09:51
  • @Bob5421 Again, this question seems a bit off kilter to me. When you `git fetch`, you are telling Git to return the latest version of _every_ file. Perhaps you want to know which _new_ files were added? I think Git will already tell you this. – Tim Biegeleisen May 27 '18 at 09:53