15

I have a git repo with several directories, and a single file, MyFile.ext.

/
  LargeDir1/
  LargeDir2/
  LargeDir3/
      .
      .
      .
  MyFile.ext

I'd like to start a new repo with just MyFile.ext in it, and keep all the history pertaining to it, but ignore everything else (all the LargeDirs). How can I do this?

For directories, I've successfully used this answer, but I tried that on a single file, and it doesn't work.

I've also tried this answer, which does delete everything except my file, but it also seems to leave all the history around.

Kris Harper
  • see if you can 'git mv' the file into a sub directory, then use 'git subtree'. – Gregg Sep 13 '16 at 21:30
  • @Gregg I will try it, but I'm pretty sure it won't work because when you move a file, you lose the git history. I've had issues with that before when using `git subtree split` on a renamed directory. – Kris Harper Sep 13 '16 at 21:35
  • @Gregg Yeah just tried it. The only commit that comes into the new repo is the commit where the file was moved to the new directory. – Kris Harper Sep 13 '16 at 22:01
  • I have this exact same problem and have tried both `git subtree split ...` and `git filter-branch ...` solutions without success. Those basically only work for subdirectories where everything in it was never altered outside that directory. What I want is a commit & log history that is what you see when you run `git log MyFile.ext`. – hepcat72 Mar 16 '17 at 19:33
  • @hepcat72 Yeah, I don't know if it's possible. I eventually gave up and lost the history. – Kris Harper Mar 16 '17 at 20:08
  • Actually, I just figured out a (rather laborious but working) solution. I was just compiling a set of steps for the solution, but it's based on the accepted solution to: http://stackoverflow.com/questions/16930919/move-some-git-commits-into-a-new-repo – hepcat72 Mar 16 '17 at 20:10

4 Answers

24

Use git fast-export.

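First, export the history of the file to a fast-import stream. Make sure you do this on the master branch: the stream carries the name of the branch you export from, and the checkout at the end expects it to match the new repo's initial branch.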

cd oldrepo
git fast-export HEAD -- MyFile.ext >../myfile.fi

Then create a new repo, import the stream, and check out a working copy.

cd ..
mkdir newrepo
cd newrepo
git init
git fast-import <../myfile.fi
git checkout
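
The comments below report "You are on a branch yet to be born" when the export was made from a branch other than master. A minimal end-to-end sketch that avoids the problem by checking out the exported branch by name; the throwaway repo, file contents, and demo identity are all made up for illustration:

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build a small stand-in for the old repo.
git init -q oldrepo
cd oldrepo
mkdir LargeDir1
echo big > LargeDir1/blob.bin
echo v1 > MyFile.ext
git add . && git commit -q -m 'add everything'
echo more >> LargeDir1/blob.bin
git commit -q -am 'touch only LargeDir1'
echo v2 > MyFile.ext
git commit -q -am 'update MyFile.ext'

# Remember which branch the stream is exported from.
branch=$(git symbolic-ref --short HEAD)
git fast-export HEAD -- MyFile.ext > ../myfile.fi

# Import into a fresh repo and check out that same branch explicitly.
cd ..
git init -q newrepo
cd newrepo
git fast-import --quiet < ../myfile.fi
git checkout -q "$branch"

git log --oneline
ls
```

Only the commits that touched MyFile.ext should show up in the log of the new repo.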
Roland Smith
  • I just tried this, but the file doesn't appear in the new repo. – Kris Harper Mar 16 '17 at 22:46
  • @KrisHarper `fast-import` creates the commits, but not a working copy. See updated answer. – Roland Smith Mar 16 '17 at 23:08
  • I get "fatal: You are on a branch yet to be born". – Kris Harper Mar 16 '17 at 23:23
  • What version of `git` are you using? This works fine with git 2.11. On some older versions of git, you might have to create an initial commit first to create the `master` branch. – Roland Smith Mar 16 '17 at 23:34
  • Git version 2.7.4 – Kris Harper Mar 16 '17 at 23:35
  • What branch were you on when you ran `fast-export`? In my tests I was on `master`, and I can confirm this works. – Roland Smith Mar 16 '17 at 23:42
  • Ah, you are correct. I was on a branch other than master. – Kris Harper Mar 17 '17 at 00:01
  • Excellent. Much simpler than what I figured out, and cleaner. Wish you'd answered this 2 days ago before I spent time trying to roll my own solution. – hepcat72 Mar 17 '17 at 14:51
  • @hepcat72 It was actually your answer that made me see this question. :-) When I originally [ran into this problem](http://rsmith.home.xs4all.nl/howto/making-a-subset-of-a-git-repository.html) in 2015, the first thing that I tried was [reposurgeon](http://www.catb.org/esr/reposurgeon/). That pointed me to *fast-import streams*, and that led me to `git fast-export`. I'm not sure where I found this exact usage; it's not in the manual page. The point here is that it's not blindingly obvious that this is a good way to do such things. – Roland Smith Mar 17 '17 at 15:25
  • For future reference: one can add multiple filenames or even directory names instead of `MyFile.ext`, very useful for splitting a repo in several pieces. – Jasper Jul 27 '17 at 08:51
  • @KrisHarper You have to `git checkout `. – Paweł Bylica Dec 22 '17 at 14:09
0

I had this same issue and I finally figured it out. I had a very old directory of scripts, so old that they had originally been under RCS control. Years ago, I made it into a git repo (without really knowing what I was doing), converting the RCS log and updating the git log. But I picked up development of one of the scripts and decided it needed its own repo. The various solutions out there (subtree and filter-branch) depend on the part you're splitting out being a directory. You can put the file in a directory and split it out that way, but you don't get the revision history with it. So here's how I extracted the revision history of a single file and created a new repo with it:

  1. Create a brand new repo [I did it at the same level as the source-repo]

    git init <new-repo>
    
  2. Now go into your source repo and create a file that we're going to use later to cherry-pick the file's commits:

    cd <source-repo>
    git log --reverse <target-file.ext> | \
        grep ^commit | cut -d ' ' -f 2 | cut -c 1-7 | \
        perl -ne 'print("pick $_")' > ../commits-to-keep.txt
    
  3. Create a temporary branch and push it to your new repo (then delete it)

    git checkout -b tmpbranch
    git push ../<new-repo> tmpbranch
    git checkout master
    git branch -d tmpbranch
    
  4. Now go to your new repo and create an empty commit off of which we will rebase:

     cd ../<new-repo>
     git commit --allow-empty -m 'root commit'
     git rebase --onto master --root tmpbranch -i
    
  5. [The only manual step] In the editor that comes up from the last command above, remove all the contents and paste in the contents of the file you created earlier: ../commits-to-keep.txt

  6. Now you can switch back to the master branch, merge, and then clean up the temporary branch:

     git checkout master
     git merge tmpbranch
     git branch -d tmpbranch
    

The only drawback here is that you end up with the extra empty root commit. I found that there are ways to remove it, but for my purposes, this was good enough.
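
The pick list from step 2 can probably be produced without the grep/cut/perl pipeline, since `git log` can print each commit in a custom format. A minimal sketch on a throwaway repo; the repo name, file names, and demo identity are placeholders:

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q source-repo
cd source-repo
echo v1 > target-file.ext
echo other > unrelated.txt
git add . && git commit -q -m 'first'
echo more >> unrelated.txt
git commit -q -am 'unrelated change'
echo v2 > target-file.ext
git commit -q -am 'second'

# One "pick <abbreviated-hash>" line per commit touching the file, oldest first.
git log --reverse --format='pick %h' -- target-file.ext > ../commits-to-keep.txt
cat ../commits-to-keep.txt
```

The resulting file has the same shape as the one built in step 2 and can be pasted into the rebase editor the same way.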

hepcat72
  • Even though @RolandSmith's answer is the way to go, I thought I'd add that my manual step can be changed to be automatic by inserting this above the rebase command: `setenv GIT_EDITOR 'vim +"%d | r ../commits-to-keep.txt | wq"'`. This would make it possible to skip step 5. Though you might not want to lose what already might be in `GIT_EDITOR`. In light of @RolandSmith's answer, I thought about taking my answer down, but I figured I'd leave it up just to show another way to sort of do it. – hepcat72 Mar 17 '17 at 15:01
0
  1. Clone the repo.
  2. Filter out everything but that one file.

Cloning can be done normally with git clone, which works fine on a local directory, as in git clone /path/to/the/repo. Then, inside the clone, remove the remote pointing back to the original repo.

git clone /path/to/the/repo
cd repo
git remote rm origin

Then use git filter-branch to filter out everything but that one file. This is easiest to accomplish with an index filter that deletes all files and then restores just the one.

git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME

An index filter runs once per commit, with that commit's tree loaded into the index; whatever is in the index when the command finishes becomes the rewritten commit. The command first removes every file from the index, then restores that one file as it was in that commit. $GIT_COMMIT is the commit being rewritten. YOURFILENAME is the file you want to keep.

If you're doing all branches and tags with --all, add a tag filter which ensures the tags are rewritten. That's as simple as --tag-name-filter cat. It will not change the content of the tags, but it will ensure they're moved to the rewritten commits.

Finally, you'll want --prune-empty to remove any now empty commits that didn't involve that file. There will be a lot of them.

Here it is all together.

git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all
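
To check the result, one option is to list every path that still appears in the rewritten history. The sketch below runs the whole procedure on a throwaway repo; the repo layout and demo identity are made up, and `FILTER_BRANCH_SQUELCH_WARNING` only silences the deprecation notice that recent git versions print. Note that `--all` would also walk the `refs/original/` backups that filter-branch leaves behind, so the check uses `--branches` instead.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the original repo.
git init -q repo
cd repo
mkdir LargeDir1
echo big > LargeDir1/blob.bin
echo v1 > MyFile.ext
git add . && git commit -q -m 'add everything'
echo more >> LargeDir1/blob.bin
git commit -q -am 'touch only LargeDir1'
echo v2 > MyFile.ext
git commit -q -am 'update MyFile.ext'
cd ..

# Work on a clone, detached from the original.
git clone -q repo extracted
cd extracted
git remote rm origin

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- MyFile.ext' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all

# Every path mentioned anywhere in the rewritten branches.
git log --branches --name-only --format= | sed '/^$/d' | sort -u
```

If the filter did its job, the last command should list only the file that was kept.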
Schwern
0

Git now recommends using git filter-repo instead (you get a message about it when using filter-branch). Another answer on one of the questions you linked has a long explanation, but here's a short example.

To remove everything except src/README.md and move it to the root:

pip install git-filter-repo
# Must use a fresh clone to avoid losing local history.
git clone --no-local project extracted
cd extracted/
git filter-repo --path src/README.md
git filter-repo --subdirectory-filter src/

--path selects the single file, and --subdirectory-filter moves the contents of that directory to the root. I can't find a way to do this in a single pass, but the second pass is much faster since the first eliminates most of the history.
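
A single pass may be possible by combining `--path` with filter-repo's `--path-rename OLD:NEW` option, which renames the paths that the filter keeps. This is a sketch under that assumption rather than something the answer above tested; the project layout and demo identity are made up, and the script exits early if git-filter-repo is not installed.

```shell
#!/bin/sh
set -eu
# Skip quietly when git-filter-repo is not installed.
git filter-repo -h >/dev/null 2>&1 || { echo "git-filter-repo not available"; exit 0; }

work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the original project.
git init -q project
cd project
mkdir src docs
echo v1 > src/README.md
echo other > docs/notes.txt
git add . && git commit -q -m 'initial'
echo v2 > src/README.md
git commit -q -am 'update README'
cd ..

# filter-repo expects a fresh clone.
git clone -q --no-local project extracted
cd extracted

# Keep only src/README.md and rename it to README.md at the root.
git filter-repo --path src/README.md --path-rename src/README.md:README.md

git log --oneline --name-only
```

If the rename behaves as assumed, the working tree ends up with README.md at the root and no src/ or docs/ directory.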

idbrii