
I have two git repositories and a lot of untracked changes between them:

   ftp -->            C-- (untracked changes) --D
                     /                           \
   git        A--B--C <-- old/master              \
                                                   \
                                                    \
                                  new/master -->     D--E--F 

How can I merge the old repository into the new repository to get a linear history like this:

A--B--C--D--E--F

EDIT:

Inspired by "How can I combine Git repositories into a linear history?", I ran:

git clone url://new new
cd new/
git remote add old url://old
git fetch old
git reset --hard origin/master
git filter-branch --parent-filter 'sed "s_^\$_-p old/master_"' HEAD
git push origin master

The only problem is that every commit from new/master is now duplicated (because its parent changed, I think), so I now have this history (M is a merge commit):

         D---E---F--         
                    \
A--B--C--D'--E'--F'--M 

How can I easily remove the unnecessary commits (D through F, and maybe M)?

ts.
  • Why not just pull from the Git repository where you are on F into the one where you are on C? – bpoiss Jun 06 '13 at 21:49
  • So the "commits" C and D on ftp aren't really commits in Git except where they appear under old/master and new/master? And how was new/master created? `git init` in a copy of what was in ftp? – andyg0808 Jun 11 '13 at 00:49
  • @andyg0808 Yes, "new" was created by running git init in a fresh copy of the ftp content. And yes, ftp is not a Git branch; it's plain old FTP. – ts. Jun 11 '13 at 01:10

4 Answers

Fixing the results of git filter-branch

If you have a repository that looks like this:

         D---E---F--
                    \
A--B--C--D'--E'--F'--M <-master

and you want the result to look like this:

A--B--C--D'--E'--F' <-master

then you can simply force master to point to F':

git checkout master
git reset --hard <sha1-of-F'>

This will cause commits D, E, F, and M to become unreachable, effectively deleting them (they will be garbage collected after a while).
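
A minimal sketch of that reset in a throwaway sandbox. The sandbox history is a stand-in (one commit for the rewritten line, one for the duplicate line, plus a merge), and the names `repo`, `dup`, and `backup-before-reset` are illustrative assumptions. In a real repository you would read the sha1 of F' from `git log --graph --oneline` instead of capturing it in a variable.

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master   # force the branch name to "master"
git config user.name "Example"
git config user.email "example@example.com"

# Stand-in for the rewritten line A--B--C--D'--E'--F'
git commit -q --allow-empty -m "F-prime"
fprime=$(git rev-parse HEAD)

# Stand-in for the duplicate line D--E--F (unrelated root)
git checkout -q --orphan dup
git commit -q --allow-empty -m "F"

# M: the merge commit that ties both lines together
git checkout -q master
git merge -q --no-edit --allow-unrelated-histories dup

# Keep M reachable under another name in case the reset was a mistake
git branch backup-before-reset

# Point master back at F'; M and the duplicate line drop out of master
git reset -q --hard "$fprime"
git branch -q -D dup

git log --oneline master
```

Once you are satisfied with the result, deleting `backup-before-reset` leaves the old commits unreachable so that they can eventually be garbage collected.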

Starting over from scratch

Assuming you have two repositories that look like this:

  • old: A--B--C <-master
  • new: D--E--F <-master

and you want the result to be:

  • combined: A--B--C--D'--E'--F' <- master

then you can perform the following steps:

  1. Initialize the combined repository:

    git init combined
    cd combined
    git remote add old url:/to/old
    git remote add new url:/to/new
    git remote update
    

    At this point your combined repository looks like:

    A--B--C <-old/master
    
    D--E--F <-new/master
    

    Note that the two branches aren't connected in any way.

  2. Set your master branch to point to C:

    git reset --hard old/master
    

    Now your repository looks like this:

          old/master
          |
          v
    A--B--C <-master
    
    D--E--F <-new/master
    
  3. Find the sha1 of D:

    d=$(git rev-list --reverse new/master | head -n 1)
    
  4. Import D into your working directory and index by reading the contents of the commit:

    git read-tree -u --reset $d
    
  5. Commit the contents of D using the same commit message, author, date, etc. as the original D commit:

    git commit -C $d
    

    Now your repository looks like this:

          old/master
          |
          v
    A--B--C--D' <-master
    
    D--E--F <-new/master
    
  6. Cherry-pick the rest of the commits:

    git cherry-pick $d..new/master
    

    Now your repository looks like this:

          old/master
          |
          v
    A--B--C--D'--E'--F' <-master
    
    D--E--F <-new/master
    
  7. Clean up:

    git remote rm old
    git remote rm new
    

    Now your repository looks like this:

    A--B--C--D'--E'--F' <-master
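
The steps above can be sketched end to end in a sandbox. This is only an illustration under assumptions: the `make_repo` helper, the one-file-per-commit layout, and the local paths `../old` and `../new` are made up for the demo and stand in for the real repositories and URLs.

```shell
set -e
cd "$(mktemp -d)"

# Hypothetical helper: create repo $1 with one commit per remaining argument
make_repo() {
  name=$1; shift
  git init -q "$name"
  git -C "$name" symbolic-ref HEAD refs/heads/master
  git -C "$name" config user.name "Example"
  git -C "$name" config user.email "example@example.com"
  for f in "$@"; do
    echo "$f" > "$name/$f.txt"
    git -C "$name" add "$f.txt"
    git -C "$name" commit -q -m "$f"
  done
}
make_repo old A B C
make_repo new D E F

# Step 1: combined repository with both histories fetched
git init -q combined && cd combined
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git remote add old ../old
git remote add new ../new
git remote update >/dev/null 2>&1

# Step 2: master starts at C
git reset -q --hard old/master

# Steps 3-5: recreate D on top of C with D's exact tree and metadata
d=$(git rev-list --reverse new/master | head -n 1)
git read-tree -u --reset "$d"
git commit -q -C "$d"

# Step 6: replay E and F
git cherry-pick "$d..new/master" >/dev/null

# Remember F's tree so the result can be compared after cleanup
new_tree=$(git rev-parse 'new/master^{tree}')

# Step 7: clean up
git remote rm old
git remote rm new

git log --format=%s --reverse master
```

Because D' is committed with D's exact tree and E and F are replayed on top, the tree of F' should match the tree of the original F, which is a cheap sanity check before removing the remotes.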
    
Richard Hansen
  • Wow! That is one advanced and excellent answer. It worked much better than the http://stackoverflow.com/questions/18270041/add-history-to-git-repository-or-merge-git-repositories (which might be a duplicate) – Adam Ryczkowski Oct 10 '14 at 13:01
  • Awesome. One minor thing: the cherry pick changes the creator of the commit, so e.g on GitHub it then shows `Original creator committed with YOU` – inetphantom Mar 14 '18 at 09:46
  • Probably `d=$(git rev-list new/master | tail -n 1)` is easier than first reversing and taking the first sha. – holzkohlengrill Dec 18 '20 at 10:12

Just check out your branch and run:

  git reset --hard <SHA-OF-F'>

That will remove M and D-F from your branch.

Chronial

If ftp is not a proper branch and just a copy-paste job, this may work for you:

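cd git
git rm -r .         # remove the tracked files of the old snapshot
cp -r ../ftp/. .    # copy in the current ftp content
git add -A          # stage everything; a bare "git add" stages nothing
git commit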
Zombo

Well, I'm not sure what the exact problem is, but what I'd do is remove the branch that contains the copies. Those commits aren't needed and will probably disappear once the branch is deleted. If they are duplicated within the same history, you can always use git rebase, at your own risk.

Running something like

git rebase -i HEAD~5

opens an editor listing the commits and the actions you can apply to them. If you delete a commit's line, that commit is removed from the history; you can also squash commits together, and so on. If you save an empty file, the rebase is aborted and nothing changes.
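
To see what deleting a line does without opening an editor, here is a small sandbox sketch. It relies on the `GIT_SEQUENCE_EDITOR` environment variable to edit the todo list with `sed`; the repository name `demo` and the three one-file commits are made up for the demo.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Three commits, each adding its own file, so dropping one cannot conflict
for f in one two three; do
  echo "$f" > "$f.txt"
  git add "$f.txt"
  git commit -q -m "$f"
done

# The todo list for HEAD~2 starts with the line for commit "two".
# Deleting that first line drops the commit; "three" is replayed onto "one".
GIT_SEQUENCE_EDITOR="sed -i.bak 1d" git rebase -q -i HEAD~2

git log --format=%s
```

In a real history, the commits after the dropped one may depend on it, in which case the rebase stops with a conflict that has to be resolved by hand.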

That said, keep in mind that after such a rebase the master branch can no longer be fast-forwarded, so you'll have to overwrite master on the server with a forced push. In other words, make sure the rebase went right and that, after you push the new branch, everybody syncs to it. If someone still has the old master and a push of theirs fails, it might make things worse.

Git rebase is like replaying the commit history, and deleting a commit is like skipping a change. If you skip a commit that is duplicated, it has no effect on the content. But each replayed commit gets a different SHA-1, which also means the branch diverges from the original master.

In any case, before doing anything, save the SHA-1 of master. If something goes wrong, you can always reset master back to that commit, as long as it has not been garbage collected.
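
One way to save that position is a backup branch rather than a sha1 written down somewhere, since a branch keeps the commit reachable and therefore safe from garbage collection. A minimal sandbox sketch; the names `demo` and `backup-master` are illustrative assumptions.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "known good state"

# Save the current position of master before any risky operation
git branch backup-master
saved=$(git rev-parse master)

# Stand-in for a history rewrite that went wrong
git commit -q --allow-empty -m "mistake"

# Restore master to the saved commit
git reset -q --hard backup-master

git log --format=%s master
```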

Deleting the other branch would be the smartest approach, and having every branch start from this master will prevent duplicates. Rebasing should be avoided unless you really have to.

Loïc Faure-Lacroix