
A university colleague of mine thought it was a good idea to fork a repository by cloning it and copying its contents, minus the .git folder of the original repository, into a new, freshly initialized repository. He then committed this copy as a single commit, and the whole team began working on the project based on that commit:

A <- B <- C     <- D <- E    (original repository)
\  clone  /        |_____| 
 \       /            |
  \     /     Ofc. work on the original repository was continued after cloning...
   \   /
     M <- N <- O <-P    (our "fork", commits from my team)

Now, my first goal is to get the following repository structure:

A <- B <- C <- N <- O <- P

Over the past few hours I have tried the following:

  1. Applying a plain diff:
    • Clone the original repository.
    • Run git diff > /path/to/patch from within the fork.
    • Run git apply within the original repository.
    • This works, but does not preserve the commits.
  2. Various other things that did not work.
  3. Applying a patch series:
    • Clone the original repository.
    • Create and switch to a new branch.
    • Reset it to commit A using git reset --hard COMMIT_HASH_A.
    • Create a patch from N <- O <- P using git format-patch COMMIT_HASH_M --stdout > /path/to/patch on the fork.
    • Apply this patch in the original repository using git am -3 /path/to/patch. After I resolve several conflicts, such as the duplicate creation of empty files, this fails with the following error: fatal: sha1 information is lacking or useless (some_file_name). Repository lacks necessary blobs to fall back on 3-way merge. This is where I am stuck.

So how do I create a repository including all commits from the original repository and from our team as described, so that eventually, a pull request could be sent to the original repository? Might a git-rebase help?

1' OR 1 --
  • use grafts and filter-branch. – jthill Sep 12 '16 at 21:01
  • There are more intelligent ways to do it but one fallback would be to use `git log -n 1 -p` for each of N, O and P and then apply them one by one to a branch in the original repo's clone with `patch -p1` and manually add and remove files as required. If it's less than 50 or so commits, then it might be faster than fighting with doing it the official way. – Art Yerkes Sep 12 '16 at 21:12
  • I just want to point out that if you don't have the `.git` folder, you don't have a "freshly initialized repository". I assume you mean he copied the files to an empty directory. – Christoffer Hammarström Sep 12 '16 at 22:29
  • What I intended to say is that he copied the files from the original repository without `.git` folder into an empty directory in order to then initialize a new repository in the destination directory. – 1' OR 1 -- Sep 12 '16 at 22:36

3 Answers

6

If you don't insist on linear history, you can merge your fork into the original repository.

In the original repo directory:

git remote add fork /path/to/fork
git fetch fork
git merge fork/master

This will preserve the commits and may result in linear history (no merge commit) if the merge can be fast-forwarded.
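
On Git 2.9 and later, `git merge` refuses to join histories that share no common ancestor, which is the situation here because M is a root commit. The following is a minimal sketch of the steps above on a throwaway reproduction, assuming such a Git version. The directory names, the `commit` helper, the `demo` identity, and the one-file-per-commit layout are made up for illustration:

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: add one file named after the label and commit it.
commit() {  # usage: commit <repo> <label>
    echo "$2" > "$1/$2.txt"
    git -C "$1" add -A
    git -C "$1" commit -q -m "$2"
}

cd "$(mktemp -d)"

# Original repository: A <- B <- C
git init -q orig
git -C orig symbolic-ref HEAD refs/heads/master
for c in A B C; do commit orig "$c"; done

# The "fork": files copied without .git, committed as a single commit M
mkdir fork
cp orig/*.txt fork/
git init -q fork
git -C fork symbolic-ref HEAD refs/heads/master
git -C fork add -A
git -C fork commit -q -m M

# Both sides keep working independently
for c in D E; do commit orig "$c"; done
for c in N O P; do commit fork "$c"; done

# Merge the fork into the original repository
cd orig
git remote add fork ../fork
git fetch -q fork
# Assumption: Git >= 2.9, where this flag is needed for unrelated histories.
git merge -q --allow-unrelated-histories -m "Merge fork" fork/master

git --no-pager log --oneline --graph
```

In this reproduction the result keeps every commit but has two root commits, A and M, joined by a merge commit.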

code_dredd
David Siro
  • Thank you very much. Although this seems to be the easiest answer, it is indeed a working one. As mentioned by torek, a `git fetch fork` is still necessary in between. – 1' OR 1 -- Sep 12 '16 at 22:15
  • This is short and simple, but was incomplete. I've added the missing command for you and gave you my +1, even though my answer is below. Still, I don't like the floating `M` commit that results from this. My answer produces linear history. – code_dredd Sep 12 '16 at 22:17
6

TL;DR

In your original repo clone, you should:

git remote add colleague /path/to/colleague
git fetch colleague
git checkout -b colleague colleague/master
git rebase master
git checkout master
git merge colleague

This will give you linear history and will not leave behind a redundant and parent-less M commit.

This is different from David Siro's answer, which will produce a merge commit and keep the redundant, parent-less M commit in the history as a second root, reachable through the branch you merged from. I don't like that scenario.

Original Post

I replicated your good and bad repository histories and was able to solve the problem by basically rebasing a remote.

These are the steps I followed:

  1. Clone original repository
  2. Add a remote to the bad repo
  3. Fetch the bad repo master branch
  4. Branch into the fetched bad repo
  5. Rebase the bad master branch onto your master (will claim some changes are already applied)
  6. Merge this branch into your master
  7. Push back to the original repository
  8. Schedule your colleague's demise

With that setup, the commands I used and the key output follow.

#
# Step 1
#
$ git clone <path-to-original-repo>
$ cd original-repo

#
# Step 2
#
$ git remote add messed-up-repo <path-to-messed-up-repo>

#
# Step 3
#
$ git fetch messed-up-repo

#
# Step 4
#
$ git checkout -b bad-master messed-up-repo/master

#
# Step 5
#
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: commit M
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.
Applying: commit N
Applying: commit O
Applying: commit P

#
# Step 5.1: look at your new history
#
$ git log --oneline --graph --decorate
* cc3121d (HEAD -> bad-master) commit P
* 1144414 commit O
* 7b3851c commit N
* b1dc670 (origin/master, origin/HEAD, master) commit E
* ec9eb4e commit D
* 9c2988f commit C
* 9d35ed6 commit B
* ae9fc2f commit A

#
# Step 6
#
$ git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
$ git merge bad-master 
Updating b1dc670..cc3121d
Fast-forward
 n.txt | 1 +
 o.txt | 1 +
 p.txt | 1 +
 3 files changed, 3 insertions(+)
 create mode 100644 n.txt
 create mode 100644 o.txt
 create mode 100644 p.txt

#
# Step 7
#
$ git push
Counting objects: 9, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (9/9), 714 bytes | 0 bytes/s, done.
Total 9 (delta 3), reused 0 (delta 0)
To /tmp/repotest/good-orig.git
   b1dc670..cc3121d  master -> master

#
# Step 7.1: look at your history again
#
$ git log --oneline --graph --decorate
* cc3121d (HEAD -> master, origin/master, origin/HEAD, bad-master) commit P
* 1144414 commit O
* 7b3851c commit N
* b1dc670 commit E
* ec9eb4e commit D
* 9c2988f commit C
* 9d35ed6 commit B
* ae9fc2f commit A

You can now destroy your colleague's messed up repository with fire and get others to continue using the original, and now fixed, repository.

Note: In your post, you said you wanted commits:

A <- B <- C <- N <- O <- P

But my solution includes commits D and E in between: A <- B <- C <- D <- E <- N <- O <- P. If you really want to throw those commits away (i.e., assuming it's not a typo in your post), you can run git rebase -i HEAD~5, remove the pick lines for those commits, and then git push --force to your good repo's origin.
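
If the interactive editor is inconvenient, the same drop can be sketched non-interactively with `git rebase --onto`. This is an alternative to the `rebase -i` step rather than the method used above, shown on a throwaway repository whose linear history A through P (one hypothetical file per commit, `demo` identity) is built only for illustration:

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Build the linear history A <- B <- C <- D <- E <- N <- O <- P
for c in A B C D E N O P; do
    echo "$c" > "$c.txt"
    git add "$c.txt"
    git commit -q -m "$c"
done

# HEAD~3 is E and HEAD~5 is C, so this replays only the commits
# after E (N, O, P) on top of C, leaving D and E out of the branch.
git rebase -q --onto HEAD~5 HEAD~3

git --no-pager log --format=%s
```

On a shared repository this still rewrites history, so the result would need the same git push --force as the interactive variant.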

I'm assuming you understand the implications of rewriting history and that you need to communicate with your users so that they don't get bitten by it.


For the sake of completeness, I replicated your setup as follows:

  1. Create original good repo history: A <- B <- C
  2. Manually copy original contents to messed up repo
  3. Generate messed up commit history: M <- N <- O <- P, where M has the same content as original A <- B <- C
  4. Add work to original repo: ... C <- D <- E
code_dredd
2

First, a note: As with all repository-wide "rewrite everything" operations, do this on a clone. If it goes well, great! If it fails horribly, remove the clone and you're no worse off than you were before. :-)

As jthill suggested in a comment, you can use grafts, or the more modern git replace, and then use git filter-branch to make the grafting permanent. This assumes that the trees associated with the commits are correct, i.e., that you do not want any changes made to the source associated with each commit (which is probably true). See How do git grafts and replace differ? (Are grafts now deprecated?) and How to prepend the past to a git repository? for more on using grafts and git replace.
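
A sketch of the git replace route on a throwaway reproduction, assuming a Git version that supports `git replace --graft` and still ships `git filter-branch`. The repository layout, the `commit` helper, the branch name `team`, and the `demo` identity are hypothetical. The graft tells Git to treat C as the parent of N, and a filter-less `git filter-branch` then rewrites N, O, and P so that the new parentage is stored in the commits themselves:

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: add one file named after the label and commit it.
commit() {  # usage: commit <repo> <label>
    echo "$2" > "$1/$2.txt"
    git -C "$1" add -A
    git -C "$1" commit -q -m "$2"
}

cd "$(mktemp -d)"

# Original repository (A..C), then the copied "fork" (M), then more work.
git init -q orig
git -C orig symbolic-ref HEAD refs/heads/master
for c in A B C; do commit orig "$c"; done
mkdir fork
cp orig/*.txt fork/
git init -q fork
git -C fork symbolic-ref HEAD refs/heads/master
git -C fork add -A
git -C fork commit -q -m M
for c in D E; do commit orig "$c"; done
for c in N O P; do commit fork "$c"; done

# Bring both histories into one repository.
cd orig
git remote add fork ../fork
git fetch -q fork
git branch --no-track team fork/master

c=$(git rev-parse master~2)     # commit C in the original history
n=$(git rev-parse team~2)       # commit N, currently a child of M
git replace --graft "$n" "$c"   # pretend N's parent is C instead of M

# Bake the replacement into real commits, then drop the replace ref.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- team
git replace -d "$n"

git --no-pager log --format=%s team
```

Because the tree of M matches the tree of C in this reproduction, no file contents change; only the parent pointers of N, O, and P are rewritten.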

Given that the two repositories started from a common base, you can also use the method outlined in David Siro's answer. There is one step missing: after git remote add you must run git fetch to mingle the two repositories into a new "union repository", as it were. I think this method is actually simpler and easier, and would try it first. Once the two repositories have been combined into one, you can rebase, merge, filter-branch, etc., as you like.

torek