Use git orphan as folders and keep history

Question

I have a few projects in different repositories that I want to unite under the same repo, each on its own orphan branch. For that, I have created a new repository and am starting inside it.

How do I take an existing repo, import it as an orphan branch and keep the history?
Is it possible to work with 2 orphan branches opened as different folders? Let's say that I have 2 orphan branches and I want to work on both of them in parallel, possible? I have much more than 2 and I want to work with single git UI opened so it will be more efficient to develop. Today I check out each repository separately. After I unite them under the same repo, I will still need to work on all branches in parallel.

orphan-branch-A
              \.gitignore
              \.README.md
              \ and more

orphan-branch-B
              \.gitignore
              \.README.md
              \ and more

@einverne thanks, but each submodule is a different repo, I want to avoid that. — SexyMF, Mar 20 '19 at 10:35
If you want to work on all branches in parallel, the best way is to split project into different module. Git can keep all your history, but you still have to frequently check out different branches. — einverne, Mar 20 '19 at 10:45

score 5 · Accepted Answer · answered Mar 20 '19 at 16:27

TL;DR

I have much more than 2 [branches] and I want to work with single git UI opened ...

Whether that's possible, and if so, how, depends on the UI. Asking about Git in general won't get you an answer to that. The command-line answer is to use git worktree (for which Git 2.15 or later is highly desirable).

Long

How do I take an existing repo, import it as an orphan branch and keep the history?

You don't, really. This operation—and question—probably makes no sense, because I doubt you mean orphan branch in the way that Git means it. Read on to the end to decide if it does make sense.

What is a Git repository anyway?

A Git repository consists, in essence, of two databases. One database simply holds Git objects, of which the most interesting one is the one called a commit, and each commit represents a complete snapshot of all files.¹ The other database holds names—branch names like master and tag names like v2.1—and those names are how you, and at least initially Git, will find interesting commits.

Each commit—which, again, represents a snapshot of all files; commits do not contain changes—is uniquely identified by its hash ID. The hash ID is a big ugly string of letters and digits that appears random, but is actually a cryptographic checksum of the entire contents of the commit: the snapshot, plus the metadata that tells you who made the snapshot (name and email address), when (time-stamp), why (log message), and so on. Because each commit stores the actual hash ID of its immediate predecessor or parent commit, it's easy for Git to start with the last commit and work backwards:

... <-F <-G <-H   <-- master

Hence a branch name like master simply holds the hash ID of the last commit in the branch. The history itself is just the chain of commits formed by starting with that commit—H in this example—and working backwards, one commit at a time, from commit to parent.
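To make this concrete, here is a minimal sketch in a throwaway repository. The directory, the file name file.txt, and the commit messages are made up for illustration; the three commits merely stand in for F, G, and H.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the text

# Three commits standing in for F, G and H.
for c in F G H; do
    echo "$c" > file.txt
    git add file.txt
    git commit -q -m "commit $c"
done

tip=$(git rev-parse master)       # the hash ID stored in the name master
parent=$(git rev-parse master^)   # the hash ID of the tip's parent
echo "master -> $tip"
git cat-file -p master | grep '^parent '   # the raw commit object names its parent

# Walking backwards from the tip, one parent at a time, is the history.
history=$(git log --format=%s master)
echo "$history"
```

The parent line printed by git cat-file is the link Git follows when it works backwards from the tip.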

There's a slight hitch here because chains are not necessarily linear. Having made a sequence of commits like the above, we might also have a second sequence of commits:

          G--H  <-- master
         /
...--E--F
         \
          I--J   <-- develop

Here, commits F and earlier are on both branches, while commits G-H are only on master and I-J are only on develop. If we then merge J into master, we get a commit that's slightly special:

          G--H
         /    \
...--E--F      K  <-- master
         \    /
          I--J   <-- develop

While commit K has a simple snapshot as usual, it now has two parents, not one, making it a merge commit. To view the history from K, we must go back to commits H and J both, at the same time. From there we go back to G and I; from there we go back to F, where the history re-converges, having diverged at the merge.

In other words, Git works backwards: history logically converges at a merge, and since Git works backwards, history actually diverges at a merge. History logically diverges at the point where you spun off a second branch, but in Git it actually converges at that point, because Git works backwards.
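A small sketch of the same picture, using a scratch repository with hypothetical file names. The commit messages F, H, and J stand in for the commits in the drawing, with only one commit per side to keep it short.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt; git add base.txt; git commit -q -m F    # shared commit
git checkout -q -b develop
echo dev > dev.txt;   git add dev.txt;  git commit -q -m J    # only on develop
git checkout -q master
echo main > main.txt; git add main.txt; git commit -q -m H    # only on master

git merge -q --no-edit develop    # creates the merge commit K on master

# rev-list --parents prints the commit followed by each of its parents.
parents=$(git rev-list --parents -n 1 master)
nparents=$(( $(echo "$parents" | wc -w) - 1 ))
echo "K has $nparents parents"

# Going backwards from both parents, the two lines of history meet again at F.
meet=$(git merge-base 'master^1' 'master^2')
git log -1 --format=%s "$meet"
```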

What makes a branch name like master special is that it always points to the last commit that we wish to say is on the branch. This is particularly important because you are asking about orphan branches.

¹The other three object types are tree (trees hold file names), blob (each blob is a file's contents), and annotated tag for tags like v2.1. Git uses the commit + tree + blob combination to construct the snapshot that each commit represents.

How Git makes new commits: the index and the work-tree

Is it possible to work with 2 orphan branches opened as different folders?

If you have Git 2.5 or later—with 2.15 or later being a good idea due to some bugs in the initial implementation in Git 2.5—you can use git worktree to work with two different branches at the same time, in two different work-trees. It's now time to talk about Git's index and work-tree notions, after which we'll get to the definition of an orphan branch.

Everything in a Git commit snapshot is frozen forever. No part of any commit—not its log message, not the user name, not the parent hash ID, and no part of any saved file stored as part of that commit—can be changed. Nothing about any existing commit, identified by some existing hash ID, can ever be changed. All of its files are frozen in time. (They're also compressed, sometimes very compressed. You can think of them as being freeze-dried, if you like.) This is great for archival: you can go back in time to any previous commit, any time you want. But it's useless for getting any new work done.

To let you get work done, then, Git gives you the ability to check out a commit. Checking out a commit does three things:

The first and most obvious is that it kind of "re-hydrates" a freeze-dried commit, extracting all of its files to some sort of work area where they have their normal, non-frozen, non-Git-ified form. This work area, which is normally right next to the repository itself, is your work-tree (or working tree, or sometimes working directory or some variant of this kind of spelling.)
The second, also obvious once you think about it, is that if you use git checkout master or git checkout develop or whatever, Git remembers which branch name you used to get the latest commit from that branch. Or, if you used git checkout <hash-id> to go back in time, it remembers the hash ID. Either way, by branch name or by hash ID, it remembers which commit you have out, too.
The third, mostly-invisible, thing that git checkout does here is to fill in Git's index.

Calling this thing the index is kind of a useless name—what does index convey after all?—so it has two more names as well: it's sometimes called the staging area, or sometimes the cache, depending on who or which part of Git is doing this calling. All three names are for the same single thing, though. What the index is and does gets a little complicated during merges, but the main thing it is and does is that it holds all the files from a commit, in their Git-ified form, ready to freeze, but—unlike a real commit—not actually frozen.

What this means is that the index holds all the files that will go into the next commit. In other words, it's sort of a proposed next commit. You start with:

git checkout master

and for each file that was in the commit identified by the name master, you now have not two but three copies of that file:

HEAD:file is the file stored in the commit. It can't be changed: it's Git-ified, frozen, and read-only. Use git show HEAD:file to see it.
:file is the file stored in the index. It can be changed! It's Git-ified, but you can replace it with a new copy, any time you want. Use git show :file to see it.
file is the file stored in your work-tree. It's an ordinary file, and you can do anything you want with it. Use ordinary (non-Git) commands to see or change it or do whatever you want.

If you've changed some file like file, and you want Git to store the new version in the next commit, you must now update your proposed next commit:

git add file

This copies the work-tree file into the index, overwriting :file with a newly Git-ified copy of your file file from your work-tree.

Hence, the index always contains the proposed next commit. You update this proposal by using git add.
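The three copies can be made visibly different in a scratch repository. This is only a sketch; the repository location and the "version N" contents are invented for the demonstration.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "version 1" > file
git add file
git commit -q -m "first"       # HEAD:file is now frozen as "version 1"

echo "version 2" > file
git add file                   # the index copy becomes "version 2"

echo "version 3" > file        # the work-tree copy moves ahead again, not yet added

committed=$(git show HEAD:file)
staged=$(git show :file)
working=$(cat file)
echo "HEAD:file = $committed"
echo ":file     = $staged"
echo "file      = $working"
```

Running git commit at this point would save "version 2", because that is what the index holds.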

Note that if you git checkout some other branch, you replace the next-commit proposal with a different proposal that matches the commit you just checked out. (There are a few exceptions to this rule, on purpose; see Checkout another branch when there are uncommitted changes on the current branch.) This, in turn, means that the index and the work-tree are really a pair: the index indexes the work-tree. When you make changes to the work-tree, by changing some files around, you need to update your index by git adding those files.

When you run git commit, what Git does is this:

save your name and email address;
save the current time (the timestamp for the new commit);
collect a log message from you, to go into the new commit;
use the current commit's hash ID as a parent hash ID;
save all of this, plus the Git-ified files in the index, into a new commit, which automatically gets a new, unique hash ID (by computing a cryptographic checksum over all of this data);
write the new commit's hash ID into the current branch.

That is, if you had:

...--F--G--H   <-- master

you now have:

...--F--G--H--I   <-- master

The name master now records the hash ID I of the new commit you just made. That new commit has as its parent the hash ID of commit H, the one you had checked out before you made this new commit.

That's how history is formed! When you ran git commit, Git made our new commit I from whatever was in the index at that moment. The new commit's parent is the commit you had Git check out. Because Git made the commit from the index, the index and the new commit match, just as they did when you first ran git checkout master to get commit H. Everything now looks good for you to modify stuff in the work-tree, use git add to copy it back into the index, and run git commit to make a new J whose parent is I and whose saved snapshot comes from the index.
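The steps listed above can be imitated by hand with Git's lower-level commands. This is a rough sketch of what git commit does, not its actual implementation; the repository, file contents, and messages are placeholders.

```shell
set -eu
# Throwaway identity; commit-tree picks up name, email and time from here.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo one > file; git add file; git commit -q -m "commit H"
old=$(git rev-parse HEAD)              # the current commit, H

echo two > file
git add file                           # update the proposed next commit

tree=$(git write-tree)                 # freeze the index contents into a tree object
new=$(echo "commit I" | git commit-tree "$tree" -p "$old")   # message on stdin, H as parent
git update-ref refs/heads/master "$new" "$old"   # write the new hash ID into the branch

git log --format='%h %s' master
```

Afterwards git status reports a clean state, because the index, the new commit, and the work-tree all hold the same content.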

Making a new branch

Now that you know how existing branches work, let's look at the process of making a new branch. Suppose we start with commit I that you just made on master:

...--F--G--H--I   <-- master

Let's make a new branch named feature/short:

git checkout -b feature/short

What we have now looks like this:

...--F--G--H--I   <-- master, feature/short (HEAD)

That is, both names—both master and feature/short—identify existing commit I. The special name HEAD, which Git uses to remember which branch we're on, is attached to the name feature/short.

Now we'll mess with the work-tree as usual, run git add as usual, and run git commit. Git will collect our name and email and the time, our log message, and so on, and make a new commit J with the snapshot from our index and with parent I. Then it will write J's actual hash ID, whatever that is, into the name feature/short:

...--F--G--H--I   <-- master
               \
                J   <-- feature/short (HEAD)

The history starting at J goes back to I and then H and so on. The new commit is at the tip of the new branch, feature/short. Our index now matches both our commit J and our work-tree, and HEAD remains attached to our branch feature/short.
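The same sequence as a runnable sketch, again in a scratch repository with made-up file contents:

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo I > file; git add file; git commit -q -m I

git checkout -q -b feature/short       # new name, same commit, HEAD attached to it

# Right after creation, both names hold the same hash ID.
if [ "$(git rev-parse master)" = "$(git rev-parse feature/short)" ]; then
    same=yes
else
    same=no
fi

echo J > file; git add file; git commit -q -m J   # only feature/short moves

head_branch=$(git symbolic-ref --short HEAD)
echo "HEAD is attached to $head_branch"
git log --format='%h %s %d' feature/short
```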

You now know everything there is to know about branches—well, except for orphan branches, which we'll get to in a moment.

Adding work-trees

If you have been paying close attention, you will have realized by now that not only does the "index" index the work-tree, both it and the work-tree also have an intimate relationship with the special name HEAD. We use git checkout to attach our HEAD to some branch name, and in the process, we fill up our index and our work-tree with everything from one particular commit, the one at the tip of that branch—the commit to which the name points. All of these entities—HEAD, index, work-tree, and branch-name—change simultaneously.

What git worktree add does is to create a new triple—a new <HEAD, index, work-tree> group—and run git checkout in that new group. The new work-tree must reside in a different area in your computer: a different folder, if you like the term folder. The newly added work-tree is on a different branch. All work-trees must be on different branches, even if those branch names identify the same commit! Each work-tree has its own index and HEAD, and if you switch from one work-tree to another, you must change your idea of your HEAD and your index.

The files inside each commit are all freeze-dried: Git-ified and compressed, and not useful. The files extracted into a work-tree are rehydrated and useful. So the ability to add more work-trees means that you can have different commits out at the same time, as long as they're out in different work-trees.

(As a special case, any work-tree can have a detached HEAD where you extract a specific commit by hash ID. So if you need to look at sixteen different historic commits, you can add 16 work-trees, each on a different detached HEAD on that historic commit, for instance.)
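A minimal sketch of two branches checked out side by side, assuming Git 2.5 or later. The directory names main and develop-tree are hypothetical.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
git init -q "$base/main"
cd "$base/main"
git symbolic-ref HEAD refs/heads/master
echo hello > file; git add file; git commit -q -m "first"
git branch develop

# A second <HEAD, index, work-tree> group in its own folder, on develop.
git worktree add "$base/develop-tree" develop

main_branch=$(git -C "$base/main" symbolic-ref --short HEAD)
dev_branch=$(git -C "$base/develop-tree" symbolic-ref --short HEAD)
echo "main folder: $main_branch, second folder: $dev_branch"

# A branch that is already checked out in one work-tree is refused in another.
if git worktree add "$base/second-master" master 2>/dev/null; then
    refused=no
else
    refused=yes
fi

git worktree list
```

Both folders share one repository, so a commit made in either one is immediately visible from the other.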

Orphan branches

Now that we have all of that out of the way, we can—finally!—look at what an orphan branch is. It's less than you think!

We already know that HEAD is normally attached to some existing branch name, and existing branch names store the hash ID of one single commit, which we call the tip of that branch. When things are set up this way, making a new commit updates the branch name, so that the existing branch name now stores the new, unique commit hash ID of the new commit we just made.

We've also mentioned, in passing, that HEAD can instead store the hash ID of a commit; Git calls this a detached HEAD. Here HEAD is not attached to a branch name, hence the word "detached". The index and work-tree work in the usual way here: the index holds all the files from the detached-HEAD commit hash ID, in their freeze-dried form but not actually frozen any more, and the work-tree holds all the files from that commit. You can make a new commit this way too: if you do, Git just stores the new commit's hash ID into the name HEAD. No branch name remembers this hash ID. Only HEAD holds that hash ID. These commits are easy to lose by mistake! If you use git checkout to move your HEAD, you've lost the hash ID of the new commits you made, so be at least a little careful with a detached HEAD, so as not to lose your head. :-)
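A sketch of how a detached-HEAD commit ends up without a name, and one way to give it one again. The branch name rescued is invented for the example.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo A > file; git add file; git commit -q -m A

git checkout -q --detach               # HEAD now holds a hash ID, not a branch name

# symbolic-ref fails when HEAD is not attached to a branch.
if git symbolic-ref -q HEAD >/dev/null; then attached=yes; else attached=no; fi

echo B > file; git add file; git commit -q -m "made while detached"
lost=$(git rev-parse HEAD)             # only HEAD remembers this hash ID

git checkout -q master                 # HEAD moves away from the new commit

# No branch name leads to the commit any more.
containing=$(git branch --contains "$lost" || true)
echo "branches containing it: '${containing}'"

# Saving the hash ID first is what makes a rescue possible.
git branch rescued "$lost"
```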

There's one more mode, though, for HEAD. Git allows you to attach your HEAD to a branch name that doesn't exist. To do that, you use git checkout --orphan:

git checkout --orphan feature/tall

This works a lot like git checkout -b. But -b first creates the branch name, and then attaches HEAD to the branch name. It's the creation of the branch name that stores a hash ID inside the name! When we made feature/short above, we created the name pointing to existing commit I, the same commit that master already remembered.

When we use git checkout --orphan, Git doesn't create the branch name. We end up with a picture like this:

...--F--G--H--I   <-- master
               \
                J   <-- feature/short

feature/tall (HEAD)

The contents of the index and the work-tree remain unchanged, exactly as before, but the name feature/tall does not exist as a branch name at all. It's just that HEAD is attached to it. Since it doesn't exist as a branch name, it doesn't point to any existing commit.

If we make a commit right now, Git will save, as a new snapshot, the contents of the index. If we didn't change anything, those contents match commit J. So we'll get a new commit K. The parent of new commit K is supposed to be whichever commit we have checked out right now—the one identified by the branch name to which our HEAD is attached. But that branch doesn't exist!

What Git does here is to do the same thing it does for the very first commit you make in a new, totally-empty repository that has no commits yet. Git simply makes the commit with no parents at all. Such a commit is called a root commit.

Having made the new commit, Git now updates the branch name to which our HEAD is attached. That name is feature/tall, so now we have:

...--F--G--H--I   <-- master
               \
                J   <-- feature/short

K   <-- feature/tall (HEAD)

The new branch, feature/tall, now exists. It has sprung into existence because we made a new commit—as always, from the index—and that new commit has no history.

History, after all, is just the chain of commits, starting at wherever and working backwards. We start at K and work backwards—well, there's nowhere else to go. So we start at K and show the commit and we're done. End of history! There's nothing else there.

Now, of course, if we start at J or I and work backwards, there's history there. But it's not connected to the history we get starting at K and working backwards. So feature/tall is an orphan branch. It's just an unrelated-to-everything branch.
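The whole sequence can be checked in a scratch repository. In this sketch a single commit J on master stands in for the existing history, and the file contents are placeholders.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo J > file; git add file; git commit -q -m J

git checkout -q --orphan feature/tall  # HEAD attached to a name that does not exist yet

# Before the first commit, the branch name cannot be resolved.
if git rev-parse -q --verify refs/heads/feature/tall >/dev/null; then
    exists_before=yes
else
    exists_before=no
fi

git commit -q -m K                     # the index still holds J's files

parents=$(git log -1 --format=%P feature/tall)   # empty for a root commit
echo "parents of K: '${parents}'"

# The two histories share no commit, so there is no merge base.
if git merge-base master feature/tall >/dev/null; then related=yes; else related=no; fi
echo "related to master: $related"
```

K carries the same snapshot as J, since the index was left untouched, but it has no parent and therefore no history behind it.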

This peculiar property is very useful in a new, totally-empty repository. Such a repository has no commits and no branches, and the very first commit we make—by creating some files, copying them into our initially-empty index, and committing—should be the first and only commit in this still-new but now not-empty repository. If our HEAD was attached to the branch name master—which of course it was—this creates our first branch name, master, pointing to the first and only commit, which we can call A but which has a unique hash ID that's a cryptographic checksum of the contents of the files we created plus our name plus our email address plus the log message we entered plus the very time when we ran git commit, all of which add up to making this commit unique in the universe.

Using git checkout --orphan sets up similar conditions, except that the index and work-tree are probably not empty. Making the first commit for this orphan branch is what creates the orphan branch. The snapshot that goes in is, as always, whatever is in the index when you run git commit. The log message is whatever you enter. The new commit has no parent, which is why Git calls it an orphan.

Conclusion

If you wanted an orphan commit, this is how you get it. But it has no history, by definition, because history is the chain of parents. If you want an orphan, you get no history; if you want history, you may not use an orphan.
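If the goal is an unrelated branch that does keep its history, one option the answer does not spell out is to fetch the other repository's branch into the combined repository under a new name. The sketch below assumes two hypothetical local repositories, project-a and combined. The fetched branch keeps its full chain of commits and simply shares none with master.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)

# Hypothetical existing project with two commits of history.
git init -q "$base/project-a"
cd "$base/project-a"
git symbolic-ref HEAD refs/heads/master
for n in 1 2; do
    echo "a$n" > a.txt
    git add a.txt
    git commit -q -m "project-a commit $n"
done

# The new, combined repository with its own unrelated root commit.
git init -q "$base/combined"
cd "$base/combined"
git symbolic-ref HEAD refs/heads/master
echo combined > README; git add README; git commit -q -m "combined root"

# Copy project-a's master, commits and all, into a local branch named project-a.
git fetch -q "$base/project-a" master:project-a

imported=$(git rev-list --count project-a)
echo "commits on project-a: $imported"

if git merge-base master project-a >/dev/null; then related=yes; else related=no; fi
echo "related to master: $related"
```

Each imported branch can then be opened in its own folder with git worktree add.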

Pheeeeew, I just saw it on my mobile... will read at the weekend (its holiday). thanks for your time. — SexyMF, Mar 21 '19 at 13:19

score 0 · Answer 2 · answered Mar 20 '19 at 10:48


You are looking for "git worktree"...

1. (Optionally) create a bare repository, e.g.:

mkdir .repo/
git clone --bare .../project .repo/project.git

2. Create the worktrees from within this repository:

git -C .repo/project.git worktree add `pwd`/project-A branch-A
git -C .repo/project.git worktree add `pwd`/project-B branch-B

You can skip step 1 and create worktrees from existing, non-bare repositories, but IMHO the bare repository eases operations when the branched projects are long-lived.
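For a version that runs end to end, the sketch below first builds a stand-in source repository, because the real location behind .../project is not given. The paths and the branch contents are assumptions; the clone and worktree commands are the ones from this answer, with "$(pwd)" in place of the backticks.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)

# Stand-in for ".../project": a repository that has branch-A and branch-B.
git init -q "$base/project"
cd "$base/project"
git symbolic-ref HEAD refs/heads/master
echo start > file; git add file; git commit -q -m "start"
git branch branch-A
git branch branch-B

# Step 1: a bare clone that only stores history and has no work-tree of its own.
mkdir "$base/work"
cd "$base/work"
mkdir .repo
git clone -q --bare "$base/project" .repo/project.git

# Step 2: one folder per branch, all backed by the same bare repository.
git -C .repo/project.git worktree add "$(pwd)/project-A" branch-A
git -C .repo/project.git worktree add "$(pwd)/project-B" branch-B

branch_a=$(git -C project-A symbolic-ref --short HEAD)
branch_b=$(git -C project-B symbolic-ref --short HEAD)
echo "project-A is on $branch_a, project-B is on $branch_b"
```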

ensc


Ok. but what about using an orphan branch from existing source? can you please also put some more info on the commands? what is `.../project` please add some more explanations.. thanks – SexyMF Mar 20 '19 at 11:20

score 0 · Answer 3 · answered Jun 23 '22 at 00:42


You can remove everything, commit that, and start from the tip. Or cherry-pick an orphaned branch on top of it.

git rm -rf .
git commit -m 'remove everything'

gamesguru


score 0 · Answer 4 · answered Dec 25 '22 at 18:34

git worktree: but what about using an orphan branch from existing source? can you please also put some more info on the commands?

This is not yet supported, but it is under active discussion/implementation:

adding orphan branch functionality (as is present in git checkout) to git-worktree add

Adds support for creating an orphan branch when adding a new worktree. This functionality is equivalent to git checkout's --orphan flag.

The original reason this feature was implemented was to allow a user to initialise a new repository using solely the worktree oriented workflow.

Example usage included below.
$ GIT_DIR=".git" git init --bare
$ git worktree add --orphan master master/

It is listed in the latest "What's cooking in git.git", but not yet officially merged to master.
