
When using Git, I have always kept a local master branch. Before doing anything with that branch, I would always do a git pull --ff-only.

I have recently been told that this is a terrible idea and that I should not keep a duplicate master branch. I took a few minutes to research this and couldn't find anything on the subject. I would like to know not only which way is preferred, but the reasons why.

As far as I understand, if I do not have a local master branch then I cannot branch off of it while offline. Also, what if I do not have any local feature branches (like starting a new project or cleaning up finished tasks)? Without a master, where would I put the HEAD?

I appreciate all and any input on the subject to help me understand more on the best practices and why it is best to follow that way.

EDIT: The person who told me this explained that I should only ever be rebasing and branching off of origin/master instead of pulling origin/master to my local master branch. I asked this question to get more feedback on the workflow and the pros/cons of each.

EDIT 2: I understand that master is just the name of a branch, and that branches are basically just names put onto the commit tree. My question is about workflow. Is it best practice to use a local master branch, which always mirrors the origin/master branch? Or should I always branch off of origin/master and rebase against origin/master without ever touching my local master branch?

Steven Rogers
  • Who told you that? – jonrsharpe May 24 '17 at 21:12
  • @jonrsharpe I'm not going to name him, just that he has been in the software industry much much longer than I have so I believe he has experience and reason to say such a thing. – Steven Rogers May 24 '17 at 21:17
  • Did they explain why? Did you ask *them?* Having local copies of the remote is how git *works*, so it seems like there's a communication issue around what you're doing that we can't really solve here. – jonrsharpe May 24 '17 at 21:20
  • There is no special meaning of `master` branch, except it's the default branch git created for you. The actual meaning of `master` varies from project to project. In the context where `master` means the active developing branch, it's always better to pull often, since that will make your life easier when resolving conflicts. – Yang Yang May 24 '17 at 21:21
  • I work with a team of other developers, and to us the 'master' branch on the repository is the production branch that is stable. I have a local branch under the same name that I don't make commits on; I only use it to pull down others' changes and then branch off of that. @jonrsharpe they explained that the local master can get mixed up and that I should only be rebasing against origin/master and delete my local master branch. I initially disagreed with that individual, and created this question to gather more pros/cons for both sides. – Steven Rogers May 24 '17 at 21:34
  • FYI: https://stackoverflow.com/questions/18137175/in-git-what-is-the-difference-between-origin-master-vs-origin-master. If this is the way the team has agreed to work, what are you hoping to get out of this? – jonrsharpe May 24 '17 at 21:40
  • @jonrsharpe I'm hoping to learn more about best practices. I know I don't know everything, I don't know what I don't know so I ask this question on what is the best thing to do. Or at least opinions and pros/cons. – Steven Rogers May 24 '17 at 22:17

3 Answers


There's no strong reason to have your own name master, and no strong reason not to have your own name master. It comes down to what you want, and what you want to deal with.

What's important to know here is that at one level, Git doesn't really care about branch names at all. (At other points, it does, but we should look first at this level where it doesn't.)

In Git, a branch name is just a moveable label, like one of those yellow sticky notes (or the "sign here" label forms of them), that you paste onto a commit. It's the commits themselves that make up the actual branches.

Branches without names

Suppose we start with a repository with just one commit in it. Call that commit A (instead of its actual incomprehensible and unpronounceable hash ID):

A

Now we add a new commit, B, whose parent is A:

A <-B

We say that commit B "points to" A. In other words, it is commit B itself that remembers "my previous commit is A".

When we add new commit C, that commit points back to B:

A <-B <-C

(What about A, where does it point? Nowhere: it has no parent! It can't, because it was the first commit. Technically A is a root commit. You will see Git print this phrase, root commit, when you make the first commit in a new repository.)

All these internal arrows are a pain to draw. We know they always point backwards: they have to, because commits can only remember their parents, not any children that don't exist yet ... and once Git makes a commit, it can never change anything in that commit. So it can't add a list of children later. As a result, Git always wants to work backwards, from the newest commits to the oldest. So let's draw them without arrows:

A--B--C

If we want to make a branch, we just pick some commit somewhere and use it as the parent of a new commit. Let's make a new commit D whose parent is B:

A--B--C
    \
     D

and now we have a branch!
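
The parent links can be inspected directly. Here is a minimal sketch, assuming a throwaway repository under a `mktemp -d` directory and empty commits whose messages A to D stand in for the hash IDs. Git insists on some name for the first commits, so master exists here, but D is made on a detached HEAD so that no branch name points at it:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name demo && git config user.email demo@example.com

git commit -q --allow-empty -m A          # root commit: no parent
git commit -q --allow-empty -m B          # parent is A
b=$(git rev-parse HEAD)
git commit -q --allow-empty -m C          # parent is B
c=$(git rev-parse HEAD)

# Make D with B as its parent; no branch name is created for it.
git checkout -q --detach "$b"
git commit -q --allow-empty -m D
d=$(git rev-parse HEAD)

# Each commit records its parent's hash; the root commit records none.
git log --format='%s <- parent: %p' "$c" "$d"
```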

Branch names

The problem Git has is that it can't find these commits quickly. This is where branch names come in. We make a branch name, like master, and make it point to the tip commit of the underlying branch:

A--B--C   <-- master
    \
     D

We need another name for commit D; let's call it develop for now:

A--B--C   <-- master
    \
     D   <-- develop

Now, the thing that makes branch names like master special is that we can get "on" them, using git checkout:

$ git checkout master
Switched to branch 'master'
$ git checkout develop
Switched to branch 'develop'

We need a way to remember which branch we're "on". Git uses HEAD for this:

A--B--C   <-- master
    \
     D   <-- develop (HEAD)

When you git checkout a branch, Git checks out the tip commit of the branch, and makes HEAD remember that particular branch-name.
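
You can ask Git which name HEAD currently holds. A small sketch, again in a scratch repository (the directory and the single commit A are just for illustration):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch explicitly
git config user.name demo && git config user.email demo@example.com
git commit -q --allow-empty -m A
git branch develop                         # a second label on the same commit

git checkout -q develop
on_develop=$(git symbolic-ref HEAD)        # HEAD holds a branch name, not a hash
git checkout -q master
on_master=$(git symbolic-ref HEAD)

echo "HEAD was $on_develop, then $on_master"
git rev-parse master develop               # both names resolve to one commit
```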

Here's the other special feature: when you make a new commit, its parent is the current (HEAD) commit, and then Git reads HEAD to see which branch it names, and moves the branch name. Let's make a new commit E on develop by changing some files, using git add, and git commit:

A--B--C   <-- master
    \
     D--E   <-- develop (HEAD)

New commit E points back to D. The name develop now points to E. The name HEAD still refers to develop, but the current commit is now E.
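
To watch the name move, record where develop points before and after a commit. This is only a sketch in a scratch repository, with B standing in for the shared history:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name demo && git config user.email demo@example.com
git commit -q --allow-empty -m B
git checkout -q -b develop
git commit -q --allow-empty -m D
before=$(git rev-parse develop)            # develop points at D

git commit -q --allow-empty -m E           # new commit while "on" develop
after=$(git rev-parse develop)             # develop now points at E

echo "E's parent: $(git rev-parse 'develop^')"
echo "HEAD names: $(git symbolic-ref --short HEAD)"
echo "master at:  $(git log -1 --format=%s master)"
```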

Protecting (or keeping) commits

I mentioned above that Git has a hard time finding commits without the names. But these names do not just make it easy to find the commit they point to. They also serve to protect these commits. Git has a maintenance command, git gc (the garbage collector), that does a slow and painful crawl to find every commit (and other object) in the repository and checks whether each one has a name or can be reached from one. If not, git gc can collect it up as trash and remove it.

Thus, the existence of a name tells Git that this commit matters: Git should keep it. If this commit matters, then its parent commit also matters. That parent commit's parent matters too, and so on all the way back to the root commit. So if master points to C, Git has to keep C (and then also B and A). If develop points to E, Git has to keep E (and then also D and B and A).
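
This sketch does not run git gc; it only lists what each name keeps reachable, which is the set gc would have to preserve. It rebuilds the five-commit drawing in a scratch repository, with the messages standing in for the hash IDs:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name demo && git config user.email demo@example.com
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git branch develop                         # develop starts at B
git commit -q --allow-empty -m C           # master:  A--B--C
git checkout -q develop
git commit -q --allow-empty -m D
git commit -q --allow-empty -m E           # develop: A--B--D--E

# Commits each name keeps alive, newest first.
kept_by_master=$(git log --format=%s master | tr '\n' ' ')
kept_by_develop=$(git log --format=%s develop | tr '\n' ' ')
echo "master keeps:  $kept_by_master"
echo "develop keeps: $kept_by_develop"
```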

But branch names are not the only names we have. We also have what Git calls remote-tracking branches, like origin/master. (And we have tags, and some special references like stash, and Git's "notes", and so on. The general term here is references, though you don't need to remember that.)

When you git clone a repository from somewhere, Git talks to another Git. That other Git has its own branches (and all its other references, including tags and remote-tracking branches). Your Git gets their list of branches (and tags; your Git normally ignores their remote-tracking branches here). Your Git then renames all their branches. So if we clone this repository, with its five commits, we get a new copy that looks like this:

A--B--C   <-- origin/master
    \
     D--E   <-- origin/develop

These origin/ names are remote-tracking branches, not regular, ordinary, local branches. They serve just as well as local branches to protect the commits and to let Git find them, but you cannot get "on" them. (If you try, you get what Git calls a "detached HEAD" instead.)

Your Git then creates at least one local branch name, usually master, using one of these remote-tracking branch names, usually origin/master, and gets you onto that branch:

A--B--C   <-- master (HEAD), origin/master
    \
     D--E   <-- origin/develop

Note that nothing happened to any of the commits. We've just added a new label, master, pointing to commit C, just like origin/master does. (And then we had Git set HEAD to master to remember that this is the branch we're "on".)
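
The renaming is easy to see with a local clone. In this sketch the directory names upstream and mine are made up, and the "other Git" is just a second scratch repository with one commit and two branches:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name demo
git -C upstream config user.email demo@example.com
git -C upstream commit -q --allow-empty -m A
git -C upstream branch develop             # their branches: master, develop

git clone -q upstream mine
cd mine

# Their branches show up under refs/remotes/origin/; only master
# also gets a local name under refs/heads/.
refs=$(git for-each-ref --format='%(refname)')
echo "$refs"
echo "on: $(git symbolic-ref --short HEAD)"
```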

(Like branch names, tag names also get copied over. However, unlike branch names, tag names don't get origin/ shoved in front. So that's one of several things that makes tags different from branches. Like remote-tracking branch names, you can't get "on" a tag either, and as a rule, tag names should not move the way branch names do.)

Using git fetch to add to your repository

This is all fine and well for the original git clone, but eventually you probably want to pick up new commits that someone else added to the repository you cloned. You do this by running git fetch. (If you run git pull, be aware that it just runs git fetch, and then runs a second Git command. So you're still using git fetch.)

What git fetch does is go back to the other Git at origin and get, from it, its current branch names and their commit hash IDs. Since hash IDs are guaranteed to be unique across all these sharing repositories, your Git can tell if those are new commits, or not. Your Git then asks their Git for the new commits (and their parents and grandparents and so on, as needed, to get back to the point where you are talking about commits you already have). Let's see what happens as we bring in two new commits from their master:

        F--G   <-- origin/master
       /
A--B--C   <-- master (HEAD)
    \
     D--E   <-- origin/develop

Here F's parent commit is C. We already had C so our Git didn't have to bring that one in. But our Git did bring in their G, which required their F. And, our Git saw that their master now names commit G. So our Git updates our memory—our origin/master—to point to G too.
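
Here is the same thing as a sketch with two scratch repositories (upstream and mine are made-up directory names). After the fetch, origin/master has moved and master has not:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name demo
git -C upstream config user.email demo@example.com
git -C upstream commit -q --allow-empty -m C
git clone -q upstream mine

# Someone adds F and G to the other repository after we cloned.
git -C upstream commit -q --allow-empty -m F
git -C upstream commit -q --allow-empty -m G

old_master=$(git -C mine rev-parse master)
git -C mine fetch -q origin

echo "origin/master: $(git -C mine log -1 --format=%s origin/master)"
echo "master:        $(git -C mine log -1 --format=%s master)"
```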

Now our master is behind, and we need to do something to make it catch up. Or, we could just delete it, as long as we stop using it as HEAD. For instance, we could git checkout -b develop origin/develop to make a new local develop based on origin/develop, and move our HEAD there:

        F--G   <-- origin/master
       /
A--B--C   <-- master
    \
     D--E   <-- develop (HEAD), origin/develop

Again, nothing happens with any commits: this is all name-shuffling. (Well, our index and work-tree get filled-in from commit E, too.)

We can update our master to match theirs:

        F--G   <-- master, origin/master
       /
A--B--C
    \
     D--E   <-- develop (HEAD), origin/develop

and now we can straighten out the kink in the graph (draw A--B--C--F--G in one big straight line). Or we can delete our name master and not have to drag it around any more, and likewise straighten out the drawing.
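
Both choices are a command or two. This sketch tries them one after the other in a scratch clone; the directory names and the branch name feature are made up for illustration:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name demo
git -C upstream config user.email demo@example.com
git -C upstream commit -q --allow-empty -m C
git -C upstream branch develop
git clone -q upstream mine
git -C upstream commit -q --allow-empty -m G
cd mine
git fetch -q origin                         # origin/master is now ahead of master

# Option 1: keep master and drag it forward (fast-forward only).
git checkout -q master
git merge -q --ff-only origin/master
updated=$(git rev-parse master)

# Option 2: drop the local name; HEAD must be somewhere else first.
git checkout -q -b develop origin/develop
git branch -q -d master
git checkout -q -b feature origin/master    # branch straight off the remote-tracking name
```

Git may warn that master is not merged to HEAD when deleting it here; it still deletes the name because master is fully contained in its upstream, origin/master.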

Nothing really changes as a result: we just have, or don't have, the name master pointing to some commit. If we do have it, we must decide whether to update it. If we don't have it, we don't have to decide anything. Those are your reasons to have, or not to have, master: so that you can remember where it was, and drag it around if you like to do that, or so that you don't have it and don't have to drag it around if you don't like to do that.

(If you do keep the name, you can tell just what's come in since the last time you dragged it forward. If you don't keep the name ... well, Git has reflogs that save previous values of references, including remote-tracking branches, so you can pretty much do the same thing, except that reflog entries eventually expire.)
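
A sketch of both ways of asking "what came in?", assuming a scratch clone that fetches twice and the default setting in which Git keeps reflogs for remote-tracking names:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name demo
git -C upstream config user.email demo@example.com
git -C upstream commit -q --allow-empty -m C
git clone -q upstream mine

git -C upstream commit -q --allow-empty -m F
git -C mine fetch -q origin                 # first fetch brings in F
git -C upstream commit -q --allow-empty -m G
git -C mine fetch -q origin                 # second fetch brings in G
cd mine

# With a local master left where it was: everything that arrived since then.
since_master=$(git log --format=%s master..origin/master | tr '\n' ' ')
# Without one: what the most recent fetch added, via origin/master's reflog.
since_fetch=$(git log --format=%s 'origin/master@{1}..origin/master' | tr '\n' ' ')
echo "since master:     $since_master"
echo "since last fetch: $since_fetch"
```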

torek
  • I sincerely appreciate this answer! Branches in general make a whole lot more sense. However, I may have been terrible at wording the question, I wanted to know if it was good practice to pull the changes from origin/master to my local master. I understand it's just a branch name. It could just as easily be called `production`, `core`, or even `foobar`. Like I said, is it best to pull into my master? or branch and rebase off of origin/master without ever having/updating a local master branch? – Steven Rogers May 24 '17 at 22:23
  • That's both the first and last part of the above answer: *it's up to you*. Which flavor of ice cream is the best? Use the method that works *for you*, they're all equivalent otherwise. – torek May 24 '17 at 22:31
  • Good point, sorry I didn't realize that. So what you're saying is that both workflows are so identical under the hood that it doesn't really matter which one I choose as long as I understand what's going on and it works for me? – Steven Rogers May 24 '17 at 22:35
  • Yes. There are a few more things to know about branch names (that they let you configure items such as the *upstream*, whether pull—which I recommend avoiding anyway—runs rebase, and so on), but you can leave that for later. – torek May 24 '17 at 22:42
  • Probably the "big deal" here is to realize that people (even if working on the same project) don't have to behave the same *on their repos*. Your repo is your turf and you get to choose what you will keep there and how you will do it (and keeping a local master branch can be seen as part of that). Sure, when you start interacting with other people you have to start "behaving", but whether to keep master is up to you, as long as you understand what's going on. – eftshift0 May 25 '17 at 01:12

master is just a name for a branch... it could be anything. As long as you understand that when you work on your "local" master branch, there's no movement of any other branch with the same name on other repositories, you will be fine.

I remember reading a tutorial, back in the day, on how to build modules for the Linux kernel, in which the writer ordered people to get out of "master" because that's the branch where Linus works. It's OK to stay on top of master as long as you understand that when you commit on it, Linus's branch does not move along with it. And of course, it makes sense to name your branch something appropriate for whatever you are doing, but it's not like you will make the world go crazy just because you choose to stay on top of your local master branch.

eftshift0
  • I work with a team of other developers, and to us the 'master' branch on the repository is the production branch that is stable. I have a local branch under the same name that I don't make commits on; I only use it to pull down others' changes and then branch off of that. – Steven Rogers May 24 '17 at 21:33
  • Sounds like a no-brainer then. No need to remove it from your local. – eftshift0 May 24 '17 at 21:38
  • I asked this question because I was told that was a terrible workflow. I was told that having a local `master` branch that always mirrors the `origin/master` branch could cause issues. I wanted to know more about the opinion of others on this subject, which way is better, and why. – Steven Rogers May 24 '17 at 22:27

I'd say no, you shouldn't. Most probably your Git workflow will forbid any commits to the master branch on the server, so there is no point in having its local copy: you won't be able to push it to the corresponding remote master. However, you'll still need to sync it with the remote master from time to time: extra effort with no pros.

Every time you need a master just use origin/master (assuming your remote is origin). E.g. starting a new branch from master:

git checkout -b new_branch origin/master

Rebasing your branch on master:

git rebase origin/master existing_branch

Building the latest master locally (you will end up in a detached HEAD state):

git checkout origin/master && make build
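
One caveat with these commands: origin/master is only as fresh as your last fetch, so it is worth running git fetch first. A sketch of the whole sequence in a scratch clone follows; the directory names, work.txt and the commit messages are made up for illustration:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name demo
git -C upstream config user.email demo@example.com
git -C upstream commit -q --allow-empty -m C
git clone -q upstream mine
cd mine
git config user.name demo && git config user.email demo@example.com

git checkout -q -b existing_branch origin/master
echo work > work.txt && git add work.txt && git commit -q -m "local work"

git -C ../upstream commit -q --allow-empty -m F   # master moves on the server

git fetch -q origin                               # refresh origin/master first
git rebase -q origin/master existing_branch       # replay local work on top of it
git checkout -q -b new_branch origin/master       # start new work from the same point
```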
Borislav Ivanov
madhead