
How can I remove the first N commits and keep the rest of commits untouched?

In other words, I want to split the repo at the n-th commit: keep the commits from the (n+1)-th onward and discard commits 1 through n.

I do not know how to achieve this. I have only tried git rebase, but it did not work.

The following is what I did with git rebase.

For example, the repo for testing is https://github.com/danistefanovic/build-your-own-x.git

  1. Run git rebase -i --root

  2. Mark the first 20 commits as deleted

    d 4b6d12e Initial commit
    d f81138e Add links to issue form
    d 3530d91 Update README.md
    d 506b834 Update README.md
    d 79f41aa Update README.md
    d f9f9113 Update README.md
    d 9c8f134 Add resource type
    d 133469b Update ISSUE_TEMPLATE
    d 3a47f54 Add tutorials #1
    d 0271577 Add tutorial #2
    d 525953b Add tutorial #3
    d 2796fe5 Update ISSUE_TEMPLATE
    d 4f58667 Updated wrong link language from Go to Node.js
    d 0c525a7 Add tutorial #5
    d 0484fca Add OS: Build a minimal multi-tasking kernel for ARM
    d 457d5bb Add tutorial #9
    d 41ba7fc Add tutorial
    d dfbbf0f Add tutorial #11
    d b76dad5 Add tutorial #12
    d cd3bd26 Add tutorial #13
    pick c5f6c94 readme: Add my tutorial on shell
    pick 1f3c285 Add a minimal interpreter, compiler (x86/Arm) and JIT compiler
    ...
    
  3. No luck. I have to resolve conflicts manually, as shown below.

    $ git rebase -i --root 
    error: could not apply c5f6c94... readme: Add my tutorial on shell
    
    Resolve all conflicts manually, mark them as resolved with
    "git add/rm <conflicted_files>", then run "git rebase --continue".
    You can instead skip this commit: run "git rebase --skip".
    To abort and get back to the state before "git rebase", run "git rebase --abort".
    Could not apply c5f6c94859d852797c26d815e438e6a697c137a9... readme: Add my tutorial on shell
    

Update

This is a duplicate of "Remove / cut off Git's revision / commit history".

However, I did not find a way to keep the timestamps.

jungle

1 Answer

First of all, are you trying to remove the first 20 commits, or the 20 commits after the first one? As you might notice in the TODO list you posted, commit 4b6d12e isn't listed so even if the operation works it wouldn't be affected. This is because you specified 4b6d12e as the upstream. If you wanted to include it (and if it has no parent, since you said you're trying to remove the first 20 commits), you would use the --root option so that you don't have to specify a commit as upstream.

Secondly, how do you define "remove a commit"? If commit A adds file1, and commit B adds file2, and I "remove" commit A, so that I'm left only with commit B' - should B' contain both file1 and file2, or file2 only?

If commit B' should contain file2 only - meaning that you're removing commit A and the changes it introduced - then interactive rebase with the d (drop) command is a way to do it. But of course it's going to cause conflicts: any commit you keep that modifies a file which no longer exists (because the commit that originally created it was deleted) is a conflict. There's no way for git to infer what you want it to do, so you have to resolve each conflict manually. The way your question trails off suggests you expect the process to be automatic, but if you're deleting the founding commits of the repo along with their changes, that can't be automatic.
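To make the conflict concrete, here is a minimal sketch in a throwaway repository. The repository, the file contents, and the commit messages are all made up for illustration; `GIT_SEQUENCE_EDITOR` with `sed` is only used so the TODO list can be edited without opening an editor. Every commit appends to the same file, which roughly mirrors the README.md edits in the question.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

# Five commits that all touch the same file.
for i in 1 2 3 4 5; do
    echo "line $i" >> README.md
    git add README.md
    git commit -q -m "commit $i"
done

# Change the first two "pick" lines of the TODO list to "drop".
# The third commit modifies a file that no longer exists, so the rebase stops.
if GIT_SEQUENCE_EDITOR="sed -i.bak -e '1,2s/^pick/drop/'" \
   git rebase -i --root >/dev/null 2>&1; then
    result="clean"
else
    result="conflict"
    git rebase --abort   # return to the original five commits
fi

remaining=$(git rev-list --count HEAD)
echo "rebase result: $result, commits on branch: $remaining"
```

In this sketch the rebase is simply aborted. In a real repository you would instead resolve each conflict and run git rebase --continue.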

Note that this is a history rewrite. If the repo is shared with others and the branch has ever been pushed (at a time when it contained one of these first 20 commits), then

(a) you'd have to use push -f to update the remote, and

(b) having updated the remote, you'll have put all other users in a broken state, from which an incorrect repair procedure on their part would undo your changes; so you have to coordinate with everyone else who would be affected if you want this to work.

On the other hand, if B' should contain both files - in which case it should be called something like AB instead of B' - then you want to do one of two things:

One option is to squash the commits instead of deleting them[1]. This is just a matter of using a different command for each commit in the rebase -i TODO list. If all you do is squash commits, then there shouldn't be any conflicts; but this is still a history rewrite, so the above notes on that still apply.
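The squash approach can be sketched non-interactively as follows. This is an illustration under assumptions, not a recipe for the repo in the question: the repository and its five commits are made up, `sed` edits the TODO list through `GIT_SEQUENCE_EDITOR`, and `GIT_EDITOR=true` accepts the combined commit message as proposed.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

for i in 1 2 3 4 5; do
    echo "line $i" >> README.md
    git add README.md
    git commit -q -m "commit $i"
done

# Leave line 1 as "pick" and mark lines 2-3 as "squash",
# so commits 1..3 are folded into a single new root commit.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,3s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -i --root >/dev/null 2>&1

commits=$(git rev-list --count HEAD)
lines=$(wc -l < README.md | tr -d ' ')
echo "commits: $commits, README lines: $lines"
```

The first TODO line has to stay a pick, because a squash needs an earlier commit to fold into. The file content at the tip is unchanged, only the history is shorter.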

The other option is to create a shallow clone of the repo. This is not a history rewrite; its major advantage is that it preserves the commit identities so nobody's repo is broken (and as a side effect, you could re-associate the removed history in the future if the need were ever to arise). See the --depth and --shallow-* options of git clone. https://git-scm.com/docs/git-clone

Being shallow isn't something you can really share through push and pull, though. If a remote needs the commits removed, you'd have to replace that remote with a shallow (and presumably mirrored) clone. It would be up to each other user with a clone to update said clone[2].
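A minimal sketch of the shallow-clone route, again with a made-up repository (the directory names full and shallow and the depth of 2 are assumptions for illustration). It checks that the truncated clone keeps the same commit ID at its tip as the full repository.

```shell
#!/bin/sh
set -e

# Throwaway working area; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"
git init -q full
(
    cd full
    git config user.name "Demo User"
    git config user.email "demo@example.com"
    for i in 1 2 3 4 5; do
        echo "line $i" >> README.md
        git add README.md
        git commit -q -m "commit $i"
    done
)

# --depth is ignored for plain local paths, hence the file:// URL.
git clone -q --depth 2 "file://$work/full" shallow

full_head=$(git -C full rev-parse HEAD)
shallow_head=$(git -C shallow rev-parse HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "same tip: $full_head = $shallow_head, commits visible: $shallow_count"
```

Because the commit IDs are unchanged, the missing history can later be fetched back with git fetch --unshallow if it is still available on the remote.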


[1] I don't actually like this terminology of the rebase command, because it promotes the misconception that a commit is defined by its patch, which at a physical level is entirely wrong. But, that is how rebase uses the words.

[2] That last point also highlights a possible side issue: No matter what approach you choose, you can't force anyone else to discard those commits from their clones. So if you're removing them because, say, they contain sensitive data, then probably the cat's out of the bag. You should treat any credentials that have ever been pushed to a shared repo as compromised, for example.

Mark Adelsberger
  • Thanks for your comment. I want AB files. I updated the description. And tried `squash`, did not work. – jungle Jun 07 '18 at 13:30
  • @jungle - Well, when you provide sufficient information, perhaps I (or someone else) will troubleshoot your attempt to use squash. As of now we don't know what exact command you issued, or what state your repo was in when you issued it, or what output the command generated, or in what way it "didn't work". I have used this technique *many* times, as have many other users; so a dismissive "it didn't work" means nothing. – Mark Adelsberger Jun 07 '18 at 13:43