26

I received some source code and decided to use git for it since my co-worker used the mkdir $VERSION etc. approach. While the past of the code currently seems unimportant, I'd still like to put it under git control as well to better understand the development process. So:

What is a convenient way to put those past versions into my already existing git repo? There is currently no remote repo, so I don't mind rewriting history, but a solution that takes remote repositories into account would of course be preferred, unless it is much more complicated. Bonus points for a script that needs no further interaction and works from either a directory-based or an archive-file-based history.

Tobias Kienzler
  • 1
    See also [Edit/amend/modify/change the first/root/initial commit in Git?](http://stackoverflow.com/q/2119480/456814), [Change first commit of project with Git?](http://stackoverflow.com/q/2246208/456814), and [Git: how to add commits before first/initial/root commit?](http://stackoverflow.com/q/16762160/456814). –  Apr 26 '14 at 23:22
  • @Cupcake Thanks, I never saw the notification for your links – Tobias Kienzler Mar 05 '15 at 09:32
  • 1
    Warning: grafts have been removed in Git 2.18 (Q2 2018). See "[What are .git/info/grafts for?](https://stackoverflow.com/a/50517809/6309)". – VonC May 24 '18 at 20:38

4 Answers

27

For importing the old snapshots, you might find some of the tools in Git's contrib/fast-import directory useful. Or, if you already have each old snapshot in a directory, you might do something like this:

# Assumes the v* glob will sort in the right order
# (i.e. zero padded, fixed width numeric fields)
# For v1, v2, v10, v11, ... you might try:
#     v{1..23}     (1 through 23)
#     v?{,?}       (v+one character, then v+two characters)
#     v?{,?{,?}}   (v+{one,two,three} characters)
#     $(ls -v v*)  (GNU ls has "version sorting")
# Or, just list them directly: ``for d in foo bar baz quux; do''
(git init import)
for d in v*; do
    if mv import/.git "$d/"; then
        (cd "$d" && git add --all && git commit -m"pre-Git snapshot $d")
        mv "$d/.git" import/
    fi
done
(cd import && git checkout HEAD -- .)
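
A variation on the loop above, offered as a rough sketch rather than a tested replacement: instead of moving .git between directories, Git can be pointed at each snapshot with --git-dir and --work-tree. The v1, v2, v10 directories, the temporary location, and the demo identity are placeholders made up for illustration, and the numeric sort assumes names of the form v<number>.

```shell
# Placeholder identity so the sketch does not depend on global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Hypothetical snapshot directories standing in for the real ones.
snap=$(mktemp -d)
for n in 1 2 10; do
    mkdir "$snap/v$n"
    echo "release $n" > "$snap/v$n/README"
done

# Empty repository that will receive one commit per snapshot.
git init -q "$snap/import"
git -C "$snap/import" symbolic-ref HEAD refs/heads/master

# Sort numerically on the part after "v" so that v10 comes after v2.
for d in $(cd "$snap" && ls -d v* | sort -t v -k 2 -n); do
    # Use the snapshot directory as the work tree of the import repository.
    git --git-dir="$snap/import/.git" --work-tree="$snap/$d" add --all
    git --git-dir="$snap/import/.git" --work-tree="$snap/$d" \
        commit -q -m "pre-Git snapshot $d"
done

# Populate the import repository's own working tree from the last commit.
git -C "$snap/import" checkout -q HEAD -- .
git -C "$snap/import" log --format=%s master
```

Because add --all also stages deletions, files that disappear between snapshots should be recorded as removed in the corresponding commit.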

Then fetch the old history into your working repository:

cd work && git fetch ../import master:old-history

Once you have both the old history and your Git-based history in the same repository, you have a couple of options for the prepend operation: grafts and replacements.

Grafts are a per-repository mechanism to (possibly temporarily) edit the parentage of various existing commits. Grafts are controlled by the $GIT_DIR/info/grafts file (described under “info/grafts” of the gitrepository-layout manpage).

INITIAL_SHA1=$(git rev-list --reverse master | head -1)
TIP_OF_OLD_HISTORY_SHA1=$(git rev-parse old-history)
echo $INITIAL_SHA1 $TIP_OF_OLD_HISTORY_SHA1 >> .git/info/grafts

With the graft in place (the original initial commit did not have any parents, the graft gave it one parent), you can use all the normal Git tools to search through and view the extended history (e.g. git log should now show you the old history after your commits).

The main problem with grafts is that they are limited to your repository. But, if you decide that they should be a permanent part of the history, you can use git filter-branch to make them so (make a tar/zip backup of your .git dir first; git filter-branch will save original refs, but sometimes it is just easier to use a plain backup).

git filter-branch --tag-name-filter cat -- --all
rm .git/info/grafts

The replacement mechanism is newer (Git 1.6.5+), but replacements can be disabled on a per-command basis (git --no-replace-objects …) and they can be pushed for easier sharing. Replacement works on individual objects (blobs, trees, commits, or annotated tags), so the mechanism is also more general. The replace mechanism is documented in the git replace manpage. Due to the generality, the “prepending” setup is a little more involved (we have to create a new commit instead of just naming the new parent):

# the last commit of old history branch
oldhead=$(git rev-parse --verify old-history)
# the initial commit of current branch
newinit=$(git rev-list master | tail -n 1)
# create a fake commit based on $newinit, but with a parent
# (note: at this point, $oldhead must be a full commit ID)
newfake=$(git cat-file commit "$newinit" \
        | sed "/^tree [0-9a-f]\+\$/aparent $oldhead" \
        | git hash-object -t commit -w --stdin)
# replace the initial commit with the fake one
git replace -f "$newinit" "$newfake"
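
As the comments on this page point out, later Git versions offer git replace --graft, which builds the fake commit for you. The following is a minimal sketch under that assumption; the throwaway repository, its two empty commits, and the demo identity are invented purely so the example can run on its own.

```shell
# Placeholder identity so the sketch does not depend on global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Throwaway repository with two unrelated root commits.
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" symbolic-ref HEAD refs/heads/old-history
git -C "$demo" commit -q --allow-empty -m "pre-Git snapshot"
git -C "$demo" checkout -q --orphan master
git -C "$demo" commit -q --allow-empty -m "first Git commit"

# The root commit of master and the tip of the imported history.
newinit=$(git -C "$demo" rev-list --max-parents=0 master)
oldhead=$(git -C "$demo" rev-parse --verify old-history)

# Create a replacement for $newinit whose only parent is $oldhead.
git -C "$demo" replace --graft "$newinit" "$oldhead"

# With replacements active the old history is visible; without them it is not.
git -C "$demo" log --format=%s master
git -C "$demo" --no-replace-objects log --format=%s master
```

The second log command shows the untouched history, which is a quick way to confirm that nothing has actually been rewritten yet.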

Sharing this replacement is not automatic. You have to push part of (or all of) refs/replace to share the replacement.

git push some-remote 'refs/replace/*'
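
On the receiving side, replacement refs are not part of the default fetch refspec, so they have to be requested explicitly. A sketch of the round trip, assuming a local bare repository as a stand-in for some-remote; all paths, commits, and the demo identity are hypothetical.

```shell
# Placeholder identity so the sketch does not depend on global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Working repository with an old history grafted under master.
share=$(mktemp -d)
git init -q "$share/work"
git -C "$share/work" symbolic-ref HEAD refs/heads/old-history
git -C "$share/work" commit -q --allow-empty -m "pre-Git snapshot"
git -C "$share/work" checkout -q --orphan master
git -C "$share/work" commit -q --allow-empty -m "first Git commit"
git -C "$share/work" replace --graft \
    "$(git -C "$share/work" rev-parse --verify master)" \
    "$(git -C "$share/work" rev-parse --verify old-history)"

# Publish master together with the replacement refs.
git init -q --bare "$share/remote.git"
git -C "$share/work" push -q "$share/remote.git" master 'refs/replace/*'

# A fresh clone sees only the short history until it fetches refs/replace.
git clone -q -b master "$share/remote.git" "$share/clone"
git -C "$share/clone" fetch -q origin 'refs/replace/*:refs/replace/*'
git -C "$share/clone" log --format=%s master
```

Anyone who prefers the unmodified history can simply skip the extra fetch, or run individual commands with --no-replace-objects.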

If you decide to make the replacement permanent, use git filter-branch (same as with grafts; make a tar/zip backup of your .git directory first):

git filter-branch --tag-name-filter cat -- --all
git replace -d "$newinit"
Tobias Kienzler
Chris Johnsen
  • thanks, this works great for a small test subset, now off to the complete one :) (I used the replacement option) – Tobias Kienzler Jun 30 '10 at 11:04
  • This is not an issue for me at the moment, but I'll ask anyway: Using the replace-option up to the point before `git filter-branch`ing does not rewrite history and is therefore easier to share, right? – Tobias Kienzler Jun 30 '10 at 11:20
  • 2
    Without *git filter-branch*, neither grafts nor replacements actually rewrite history (they just produce an effect on the commit DAG as if they had rewritten history). The benefits of replacements are 1) they can be disabled by command line argument or environment variable, 2) they can be pushed/fetched, 3) they work on any object, not just the parent “attributes” of commits. The ability to push replacements makes them easy to share via the normal Git protocols (you can share graft entries, but you have to use some "out of band" mechanism (i.e. not push/fetch) to propagate them). – Chris Johnsen Jun 30 '10 at 11:41
  • @Chris I just noticed that a file from the old version which I did not possess and therefore is not in my history got deleted, is it possible to undelete the file? Basically I search for the inversion of [How do I remove sensitive files from git’s history](http://stackoverflow.com/questions/872565/how-do-i-remove-sensitive-files-from-gits-history). Sidenote: using grafts, the deletion occurs at the original initial commit, using replace at the second original commit... – Tobias Kienzler Jun 30 '10 at 14:39
  • I asked this as a separate question ( [How to undelete a file previously deleted in git’s history?](http://stackoverflow.com/questions/3150394/how-to-undelete-a-file-previously-deleted-in-gits-history) ), just in case someone else wants to know – Tobias Kienzler Jun 30 '10 at 14:51
  • I just can't stop asking more questions... But [Can tags be automatically moved after a git rebase?](http://stackoverflow.com/questions/3150685/can-tags-be-automatically-moved-after-a-git-rebase) Rewriting history worked fine, but now my tags are on another timeline... – Tobias Kienzler Jun 30 '10 at 15:23
  • You can rewrite the tags with: `git filter-branch --tag-name-filter cat --original refs/original-tags-too -- --all`, but that will not completely do what you want if you have also done a [rebase in the interim](http://stackoverflow.com/questions/3150394/how-to-undelete-a-file-previously-deleted-in-gits-history/3150528#3150528) (it will only move the tags to the post-filter-branch commits, not to the post-rebase commits). I would suggest identifying and fixing the cause of the missing file in the original commits and then re-doing the replace/graft+filter-branch (this time also filtering tags). – Chris Johnsen Jun 30 '10 at 19:38
  • 1
    Reading your “undelete” question, I see the file in question was in the snapshots, but not in your Git history. If you have tags to bits of your Git history, this is what I would do: start with your original Git repository (before any filtering or rewriting; see `refs/original/` if you do not have a plain backup/clone), use `filter-branch --tag-name-filter cat --tree-filter … -- --all` (or `--index-filter`) to add the file to your history while rewriting its tags, then do the graft/replace and `git filter-branch --tag-name-filter cat -- --all` to permanently establish the graft/replacement. – Chris Johnsen Jun 30 '10 at 20:00
  • Thanks, this is just great! FWIW, you can easily push graft “suggestions” for past history as comments in a README file or, unobtrusively, .gitignore, then record the past part as merged, if you can’t publish a rebased streamlined history (which is not suggested anyway). – mirabilos Jul 24 '13 at 14:50
  • Strongly related post by same author: https://stackoverflow.com/a/3811217/321973, and this post has been credited by https://developer.atlassian.com/blog/2015/08/grafting-earlier-history-with-git/ – Tobias Kienzler Feb 09 '17 at 12:49
  • 2
    Warning: grafts have been removed in Git 2.18 (Q2 2018). See "[What are .git/info/grafts for?](https://stackoverflow.com/a/50517809/6309)". – VonC May 24 '18 at 20:38
  • @VonC Good to know, thanks! But to be more precise, aren't they replaced by `git replace --graft`? – Tobias Kienzler May 26 '18 at 09:09
  • @tobias Yes, that is what my answer illustrates. – VonC May 26 '18 at 09:10
  • @VonC Ah yes, perfect. Your comment made it sound like there'd be no alternative – Tobias Kienzler May 26 '18 at 09:12
3

If you don't want to change the commits in your repository, you can use grafts to override the parent information for a commit. This is what the Linux Kernel repo does to get history from before they started using Git.

This message: http://marc.info/?l=git&m=119636089519572 seems to have the best documentation that I can find.

You'd create a sequence of commits relating to your pre-git history, then use the .git/info/grafts file to make Git use the last commit in that sequence as the parent of the first commit you generated using Git.

Andrew Aylett
  • 1
    +1 ah yes, I see, thank you. This is detailed as the graft-option in [Chris Johnsen's answer](http://stackoverflow.com/questions/3147097/how-to-prepend-the-past-to-a-git-repository/3148117#3148117) – Tobias Kienzler Jun 30 '10 at 11:11
2

The easiest approach is of course to create a new git repo, commit the history to prepend first, and then reapply the patches of the old repo. But I'd prefer a less time-consuming, automated solution.

Tobias Kienzler
0

If you just want to permanently merge 2 repositories, the best solution is to export all commits from the second repository (except the initial commit, which created the repository as a continuation of the other).

I think this is best because when you follow the steps explained by Chris Johnsen, the initial commit of the second repository is turned into a deletion commit which deletes several files. And if you try to skip the initial commit, your second commit is turned into a commit which deletes all files (of course, I had to try it). I am not sure how this affects Git's ability to track file history in commands such as git log --follow -- file/name.txt

You can export the whole history of your second repository (except the first commit, which is already present in the first repository) and import it into the first repository by running these commands:

  1. Open a Linux command line in your second repository (to export the latest commits)
  2. commit_count=$(git rev-list HEAD --count)
  3. git format-patch --full-index -$(($commit_count - 1))
  4. Move all the .patch files created in the root of your second repository to a new directory called patches, next to the root directory of your first repository
  5. Now, open a Linux command line in your first repository (to import the latest commits)
  6. git am ../patches/*.patch
  7. If you run into problems while applying the patches, run git am --abort, then see git: patch does not apply and try something like git am ../patches/*.patch --ignore-space-change --ignore-whitespace as suggested in the linked Stack Overflow question.
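
The steps above can be sketched as a single script. This is only an illustration under stated assumptions: the two throwaway repositories, their file contents, and the demo identity are made up, and the patches are written directly to a shared directory with -o instead of being moved by hand.

```shell
# Placeholder identity so the sketch does not depend on global Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

tmp=$(mktemp -d)

# First repository: holds the old history.
git init -q "$tmp/first"
echo one > "$tmp/first/file.txt"
git -C "$tmp/first" add file.txt
git -C "$tmp/first" commit -q -m "old: last commit before the split"

# Second repository: starts as a copy of the first one's content, then continues.
git init -q "$tmp/second"
cp "$tmp/first/file.txt" "$tmp/second/file.txt"
git -C "$tmp/second" add file.txt
git -C "$tmp/second" commit -q -m "new: initial import"
echo two >> "$tmp/second/file.txt"
git -C "$tmp/second" commit -q -am "new: second change"
echo three >> "$tmp/second/file.txt"
git -C "$tmp/second" commit -q -am "new: third change"

# Export every commit except the initial one as a patch file.
commit_count=$(git -C "$tmp/second" rev-list --count HEAD)
git -C "$tmp/second" format-patch -q --full-index \
    -o "$tmp/patches" "-$((commit_count - 1))"

# Replay the patches on top of the first repository.
git -C "$tmp/first" am -q "$tmp"/patches/*.patch
git -C "$tmp/first" log --format=%s
```

Since git am creates new commits, the imported commits get new IDs; only their content, messages, and authorship carry over.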

As an alternative to the command line, you can use a Git GUI such as SmartGit or GitExtensions.

References:

  1. https://www.ivankristianto.com/create-patch-files-from-multiple-commits-in-git/
  2. Git: How to create patches for a merge?
  3. how to apply multiple git patches in one shot
  4. https://davidwalsh.name/git-export-patch

For completeness, here is an automated shell script which follows Chris Johnsen's steps to permanently merge 2 repositories. Run it in the first repository, the one into which you want to integrate the history of the second repository (which continues the history of the first). After a few hours of experimentation, I found this to be the best approach. If you know how to improve something, please fix/share/comment.

Please make a full backup of both your first and second repositories (e.g. as a .zip file) before running this.

old_history=master
new_history=master-temp

old_remote_name=deathaxe
old_remote_url=second_remote_url

git remote add $old_remote_name $old_remote_url
git fetch $old_remote_name
git branch --no-track $new_history refs/remotes/$old_remote_name/$old_history
git branch --set-upstream-to=origin/$old_history $new_history

# the last commit of old history branch
oldhead=$(git rev-parse --verify $old_history)

# the initial commit of current branch
# newinit=$(git rev-list $new_history | tail -n 2 | head -n -1)
newinit=$(git rev-list $new_history | tail -n 1)

# create a fake commit based on $newinit, but with a parent
# (note: at this point, $oldhead must be a full commit ID)
newfake=$(git cat-file commit "$newinit" \
        | sed "/^tree [0-9a-f]\+\$/aparent $oldhead" \
        | git hash-object -t commit -w --stdin)

# replace the initial commit with the fake one
# usage: git replace <object> <replacement>
git replace -f "$newinit" "$newfake"

# If you decide to make the replacement permanent, use git filter-branch
# (make a tar/zip backup of your .git directory first)
git filter-branch --tag-name-filter cat -- --all
git replace -d $newinit

git push -f --tags
git push -f origin $new_history

git checkout $old_history
git branch -d $new_history
git pull --rebase

References:

  1. https://feeding.cloud.geek.nz/posts/combining-multiple-commits-into-one/
  2. https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-replace.html
  3. Remove the last line from a file in Bash
  4. Force "git push" to overwrite remote files
  5. Git force push tag when the tag already exists on remote
Evandro Coan