
Can I create a branch from master without any of the content held on master being copied over to the new branch? If so, what is the Git command to do it?

Kurt
lio
  • Not really. A branch by definition is a modification of current code. But there are work-arounds depending on what you want the branch to mean. You can checkout the first commit and branch from that. That will mean your branch is behind in history. You can checkout master, branch from master, delete all files then commit that branch. That will make the branch current in history – slebetman May 20 '20 at 06:25
  • @slebetman that's what I did before, so is there a shorter way to do it? – lio May 20 '20 at 06:44
  • slebetman's comment included two possible solutions, which one are you referring to by "that"? – 1615903 May 20 '20 at 07:05

2 Answers


TL;DR

No—but the question as asked ("create ... from the master") forces that answer; we have to interpret "from" in a particular way, which forces the "no". With the right question, the answer becomes "yes". See Creating a new empty branch for a new project or read the long answer below.

Long

Branches are not created from branches. They're created because of, and/or pointing to, commits.

More specifically, a branch name is simply a pointer to one single commit, which Git calls the tip commit. Commits themselves also work as pointers: each commit points backwards, to its parent or parents.

That is, we have some series of commits:

... <-F <-G <-H

that ends in some most-recent commit whose hash ID is H. Commit H points back to its parent G, which points back to F, and so on. The branch name that locates this last commit H just points to H itself:

...--F--G--H   <-- master

When you create a new branch name, the—or at least a—usual method is to use:

git branch <new-name> [<start-point>]

where the optional start-point argument specifies which commit hash ID will go into this new branch name. If you omit the start-point, the hash ID that goes into the new branch name is the hash ID of the current commit. If the current commit is H, and the new name is br, we get:

...--F--G--H   <-- br, master

You can only be "on" one of these branches—with git checkout master or git switch master, for instance—so Git attaches the special name HEAD to one branch name to remember which branch name we're using:

...--F--G--H   <-- br, master (HEAD)

and hence HEAD provides the answers to two separate questions:

  • What branch are we on? (Read HEAD to see where it's connected.)
  • What commit are we on? (Read HEAD, then read the branch name to which it's connected.)
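Both questions can be asked of Git directly. The following is only a sketch: the throwaway repository, the Demo identity, and file.txt are assumptions made for illustration, not part of the question.

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "H"
git branch br                             # second name for the same commit

# What branch are we on? Read HEAD to see which name it is attached to.
branch_before=$(git symbolic-ref --short HEAD)
# What commit are we on? Resolve HEAD through that name to a hash ID.
commit_before=$(git rev-parse HEAD)

git checkout -q br                        # re-attach HEAD to the other name
branch_after=$(git symbolic-ref --short HEAD)
commit_after=$(git rev-parse HEAD)

echo "before: $branch_before at $commit_before"
echo "after:  $branch_after at $commit_after"
```

The branch answer changes when HEAD is re-attached, while the commit answer stays the same, because both names point to the same commit.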

Various other forms of "create a new branch" operation all do the same thing: they create a new name that points to some existing commit.

That existing commit has whatever files it has. The existing commit is already on the current branch, if you use the default of creating from HEAD, or is on whatever set of branches it's on now, if you use the optional starting-point argument. The end result in this case:

...--F--G--H   <-- br, master (HEAD)

is that all of these commits are now on two branches, instead of being on just one branch.
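To see that creating a name copies nothing, one can compare the hash IDs the two names hold. This is a minimal sketch in a scratch repository; the Demo identity and file.txt are assumptions for illustration.

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "H"

git branch br            # no start-point given, so br gets the current commit

master_tip=$(git rev-parse master)
br_tip=$(git rev-parse br)
echo "master -> $master_tip"
echo "br     -> $br_tip"

# Lists every local branch whose history includes that commit.
git branch --contains "$master_tip"
```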

When you're on some branch B and you make a new commit, the new commit gets a new, unique hash ID. So let's say we check out br by name:

...--F--G--H   <-- br (HEAD), master

and then make some changes and commit them to make a new commit I. New commit I has a full snapshot of all files—commits don't hold changes, they hold snapshots—but more importantly, commit I's parent is existing commit H:

...--F--G--H
            \
             I

The name master does not move. We're not on master; we're on br. The name br does move:

...--F--G--H   <-- master
            \
             I   <-- br (HEAD)

Commits up through H continue to be on both branches but new commit I is only on br.
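The same picture can be reproduced in a scratch repository. This sketch assumes a Demo identity and a single file.txt, both invented for the example.

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "H"

git checkout -q -b br     # create br at H and attach HEAD to it
echo more >> file.txt
git add file.txt
git commit -q -m "I"      # new commit I; only br moves

h=$(git rev-parse master)
i=$(git rev-parse br)
parent_of_i=$(git rev-parse br^)
echo "master still points to $h"
echo "br now points to       $i"
echo "parent of I is         $parent_of_i"
```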

We—or Git, at least—find commits by taking a name like br or master, following its arrow, and getting a commit like I or H. From that commit, Git can follow its backwards-pointing arrow: if we're at H and move back one step, we land at G. That commit has a parent too, so if we follow that arrow, we wind up at commit F, and so on.

If we have Git move the name master, this changes the set of branches on which the existing commits can be found. It does not change any of the existing commits, it just changes the names that can find them. This is what Git is mostly about, really: we make new commits, which hold snapshots and have pointers to their parents. Then we have Git check out some commit, by branch name or hash ID, and we get the old snapshot. Or, we have Git compare any two commits. If we compare parent and child, the difference shows what someone changed. The commits are still snapshots!

It's like saying it was 20˚C out today and 18˚C yesterday, so the difference is 2˚C. We might care about the difference in temperature, or about the actual temperature. Git can do either one—but it stores the actual things, not the differences.
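The snapshot-versus-difference distinction shows up if you list what a commit holds and then ask Git to compare it with its parent. A sketch, assuming two made-up files a.txt and b.txt in a scratch repository:

```shell
# Throwaway repository; the path, identity, and file names are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo one > a.txt
echo two > b.txt
git add a.txt b.txt
git commit -q -m "H"

echo changed > a.txt      # only a.txt differs in the next commit
git add a.txt
git commit -q -m "I"

# The snapshot in I still holds every file, changed or not.
snapshot_files=$(git ls-tree -r --name-only HEAD)
# The difference is computed on demand by comparing two snapshots.
changed_files=$(git diff --name-only HEAD~1 HEAD)

echo "files stored in I:"; echo "$snapshot_files"
echo "files that differ from H:"; echo "$changed_files"
```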

The initial empty repository dilemma

Now, there's obviously a problem in a new, totally-empty repository. A branch name like master must point to some existing, valid commit. But this new, totally-empty repository has no commits. So which commit can master point to?

Git solves this problem by saying that you're on branch master, but branch master doesn't exist. There are no commits, so there is no valid hash ID, so the branch just doesn't exist. That makes it OK: it doesn't exist, so the fact that it can't identify any existing commit is fine.

In other words, it's OK to be on a branch that simply doesn't exist—and you get that state automatically, in any new, totally-empty repository. Then, when you make the very first commit, Git creates the commit and creates the branch name at the same time:

A   <-- master (HEAD)

There is now one commit in the repository, with some big ugly hash ID, but we're calling it A here. The name master now exists and points to existing commit A.
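This unborn state can be observed in a fresh repository. The sketch below assumes a scratch directory, a Demo identity, and a file named file.txt; it checks whether the name master resolves to a commit before and after the first commit.

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"

# HEAD names a branch, but that branch has no commit to point to yet.
head_target=$(git symbolic-ref HEAD)
if git rev-parse --verify --quiet refs/heads/master >/dev/null; then
    before="exists"
else
    before="unborn"
fi

echo hello > file.txt
git add file.txt
git commit -q -m "A"      # creates commit A and the name master together

if git rev-parse --verify --quiet refs/heads/master >/dev/null; then
    after="exists"
else
    after="unborn"
fi
echo "HEAD -> $head_target; before first commit: $before; after: $after"
```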

You can recreate this dilemma any time

You can put Git back in this situation any time: just use git checkout --orphan new-branch. Git will put you on a branch name that doesn't exist. git status will tell you that you're on the new branch and that there are no commits yet, while git log will show no commits (it prints an error message instead; the wording has varied between Git versions, and newer ones say that the current branch does not have any commits yet).
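Here is a short sketch of that state in a scratch repository (the Demo identity and file.txt are assumptions for the example). Note that the index still holds the files from the commit you were on.

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "H"

git checkout -q --orphan new-branch       # HEAD now names a branch that does not exist

current_branch=$(git symbolic-ref --short HEAD)
if git rev-parse --verify --quiet refs/heads/new-branch >/dev/null; then
    state="exists"
else
    state="unborn"
fi
staged=$(git ls-files)                    # the index is still full of H's files

echo "on $current_branch, which is $state; index holds: $staged"
git status --short --branch
```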

Git makes new commits from Git's index

Newbies to Git often think that Git uses their work-tree files. This is a great source of frustration, because it's just not true. Git will fill in your work-tree, when you ask it to, but that's not what it uses, to make the snapshots for new commits.

When you ask Git to make a new commit:

git commit

Git writes out, as a new snapshot, all of the files that are in Git's index. This name, index, is not a very good name, so this thing now has another name: Git calls it the staging area.[1] The files that are in Git's index are the ones that go into the snapshot.

Normally, the index is full of files. It's just that normally, these files also match all the files in the current commit. That is, suppose you're on commit H:

...--G--H   <-- master (HEAD)

You got here by doing git checkout master. That filled Git's files (the index) from commit H, and also filled your work-tree (the files you can see and work with) from commit H.[2] So the files in Git's index match those in the HEAD commit. That makes Git say that there is nothing to commit.

You can put files into Git's index using git add, which copies from your work-tree to Git's index. You can take files out of Git's index using git rm, which removes both your work-tree copy and Git's index copy. If you remove all of the files from Git's index (and your work-tree), the index is now truly empty.

If you change any work-tree file, you always have to git add it again. The reason for that is now, finally, simple and obvious: Git isn't using the work-tree copy. For Git to put the updated file in the next commit, you have to first tell Git to copy the work-tree copy back into the index, replacing the existing index copy.[3]
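A quick way to convince yourself that commits come from the index is to change a file again after git add and see which version lands in the commit. This is a sketch in a scratch repository; the Demo identity, file.txt, and the v1/v2/v3 contents are invented for the example.

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo v1 > file.txt
git add file.txt
git commit -q -m "H"

echo v2 > file.txt
git add file.txt          # the index copy is now v2
echo v3 > file.txt        # the work-tree copy is v3; the index still holds v2
git commit -q -m "I"      # snapshot is made from the index, not the work-tree

committed=$(git show HEAD:file.txt)
on_disk=$(cat file.txt)
echo "committed: $committed, on disk: $on_disk"
```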


[1] Staging area is mostly a better name, but it doesn't cover everything that the index actually does, so I tend to stick with index myself.

[2] Technically, Git copies first from a commit to its index, then from its index to your work-tree. Up until Git 2.23 and its new git restore command, Git would internally always do it in that two-step kind of process.

[3] Technically, the index actually holds only the file's name, mode, and a blob hash ID. The actual data (the copy of the file) is stored as an internal Git object. But this is all invisible to you unless and until you start using the git ls-files --stage and git update-index commands, which aren't really meant for normal work.


This finally gets you the answers you need

Suppose we do two things:

  1. git checkout --orphan new-branch
  2. git rm -r . from the top level

Command #1 put us on a new branch new-branch, which does not exist. This is similar to the state in a totally empty repository, when we're on master before master exists, except this time the branch name that doesn't exist, that we're on anyway, is new-branch.

Command #2 removes all the files that are in Git's index, and also removes these same files from your work-tree. This de-clutters your work-tree so that you can see that there are no files now. Note that untracked files won't get removed; to remove those too, run git clean -dfx (-d also removes untracked directories, -f forces the removal, and -x also removes files that .gitignore would otherwise protect).

You now have a clean slate: a truly empty index, and no files. You now can create new files and git add them to copy them into Git's index. When you have the set of files you like, use git commit to create a new commit that has no parent:

...--G--H   <-- master

 I   <-- new-branch (HEAD)

The new commit I does not point back to H this time. The parent of a new commit is what was the current commit, and git checkout --orphan arranged for there to be no current commit. We had a current branch, but the branch didn't exist, and hence there was no current commit.
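Put together, the whole recipe might look like the sketch below. The scratch repository, the Demo identity, and the file names old.txt and new.txt are assumptions for illustration; the forced form git rm -rf . is used because, on an unborn branch, Git has no current commit to check the index against.

```shell
# Throwaway repository; the path, identity, and file names are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo old > old.txt
git add old.txt
git commit -q -m "H"

git checkout -q --orphan new-branch       # step 1: unborn branch
git rm -rf -q .                           # step 2: empty the index and the work-tree
echo fresh > new.txt
git add new.txt
git commit -q -m "I"                      # root commit: it has no parent

commits_on_new_branch=$(git rev-list --count new-branch)
files_on_new_branch=$(git ls-tree -r --name-only new-branch)
files_on_master=$(git ls-tree -r --name-only master)

echo "new-branch has $commits_on_new_branch commit(s) holding: $files_on_new_branch"
echo "master still holds: $files_on_master"
```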

Note that if you create the branch the ordinary way in step 1 (git checkout -b new-branch, with no --orphan) and then make the same commit, you will end up with:

...--G--H   <-- master
         \
          I   <-- new-branch (HEAD)

That is, you can go ahead and make the initial setup look like this:

...--G--H   <-- master, new-branch (HEAD)

You can then remove all the files with git rm -r ., create new files, add them to the now-empty index, and commit, and you'll get the new commit with parent H as usual. Commit H will, however, be on the new branch—and Git will now compare the snapshot in commit H, with all of its files, with the snapshot in new commit I, with its independent files, and tell you that the way to turn commit H into commit I involves deleting all those files.
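For comparison, here is a sketch of the connected variant, again in a scratch repository with an invented Demo identity and the made-up files old.txt and new.txt. A plain git rm -r . is enough here because the index matches the current commit.

```shell
# Throwaway repository; the path, identity, and file names are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo old > old.txt
git add old.txt
git commit -q -m "H"

git checkout -q -b new-branch             # ordinary branch: starts at H
git rm -r -q .                            # remove H's files from index and work-tree
echo fresh > new.txt
git add new.txt
git commit -q -m "I"                      # I's parent is H

parent_of_i=$(git rev-parse new-branch^)
h=$(git rev-parse master)
# Comparing the two snapshots reports old.txt as deleted and new.txt as added.
changes=$(git diff --name-status master new-branch)
echo "$changes"
```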

In other words, the difference between these two setups is that with --orphan, the new branch is not connected to the old one at all. The commits on the new branch begin with a new root commit. The histories no longer rejoin at some point in the past (commit H): they are separate, unrelated histories. New commits you add after I continue to be unrelated to commit H:

...--G--H   <-- master

 I--J--K   <-- new-branch (HEAD)
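One way to check that the two histories really are unrelated is to ask Git for their common ancestor, which should not exist. A sketch under the same assumptions as before (scratch repository, Demo identity, made-up file names):

```shell
# Throwaway repository; the path, identity, and file names are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo old > old.txt
git add old.txt
git commit -q -m "H"

git checkout -q --orphan new-branch
git rm -rf -q .
echo fresh > new.txt
git add new.txt
git commit -q -m "I"

# git merge-base exits non-zero when the two commits share no ancestor.
if git merge-base master new-branch >/dev/null; then
    related="yes"
else
    related="no"
fi
echo "master and new-branch share history: $related"
```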

You can, at any time, git checkout master to select existing commit H and branch name master. This will remove from Git's index the files from your current commit K, removing those files from your work-tree as well, and extract into Git's index the snapshot in H and copy those files into your work-tree.

Conclusion

What you need to know is this:

  • History, in Git, is the commits.
  • Git finds the commits by starting from branch names—which find tip commits—and working backwards.
  • Making a new commit consists of checking out a branch by name, so that Git knows which name to update, and then modifying files in Git's index, because Git makes the commits from the index copies of the files.
  • Your work-tree comes along for the ride, when you git checkout a commit, because the files in Git's index are in Git's internal Git-only format, which is only useful to Git, not to you or the rest of your computer software. You will work on the work-tree files, then use git add to copy them back into Git's index, ready for the next commit.
  • New branch names generally already point to existing commits the moment you make them. It's just the one git checkout --orphan case, plus the initial entirely-empty repository, that are special. These make the new name when you make the commit. Until then, you are in a special unborn branch state.
torek

You can stage an empty tree, which is a fast way to delete every file in the next commit. git read-tree --empty sets the index to an empty tree without touching your work-tree files. Committing that keeps the existing history but adds a commit that deletes everything. If you want a separate history with no connection to the old one, see Creating a new empty branch for a new project.
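A minimal sketch of this approach, assuming a scratch repository, a Demo identity, and a single made-up file old.txt:

```shell
# Throwaway repository; the path, identity, and file name are illustrative.
cd "$(mktemp -d)" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master
git config user.name "Demo" && git config user.email "demo@example.com"
echo old > old.txt
git add old.txt
git commit -q -m "H"

git read-tree --empty                     # index is now empty; work-tree is untouched
git commit -q -m "Remove everything"      # child of H with an empty snapshot

tracked=$(git ls-tree -r --name-only HEAD)
commit_count=$(git rev-list --count HEAD)
echo "commits in history: $commit_count; tracked files now: '$tracked'"
ls                                        # old.txt is still on disk, now untracked
```

The leftover work-tree files can be deleted by hand or with git clean if you do not want them.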

patthoyts