Rewrite a git repository to change the date range of commits

Question

Say I have a rather large Git repository with 1000+ commits. Commit dates range from August 2013 to now (August 2014). All commits were made by one user (me).

Now, for some reason, I need to make all commits in the repository appear to have taken place between March 2014 and now.

This can be achieved either by changing the existing repository, or by creating a new one and re-committing all the changes.

If it were for a handful of commits, I would manually check out every revision and commit the state to a new repository using the --date switch as described in the Git documentation.

However, with this many commits, doing that by hand is impractical.

Why did someone downvote this? It is a perfectly legit question, and it's written pretty good. — Max Yankov, Aug 08 '14 at 07:39
Updated answer using **git filter-repo** https://stackoverflow.com/a/60873857/461597 — Unapiedra, Mar 26 '20 at 18:28

score 2 · Accepted Answer · answered Aug 08 '14 at 09:21

This is the kind of thing git filter-branch does.

With git filter-branch you list commit(s) and branch-names that should be used. As the documentation says (somewhat cryptically):

The command will only rewrite the positive refs mentioned in the command line ...

In your case, this likely means you want --all to cover all the branches, which coincidentally (or not-coincidentally, really) also tells the filter-branch script to look at everything there is to be found in the repository (i.e., all commits, and also all tags / annotated-tags). This is because the --all argument is given to git rev-list, where it lists all commits (and annotated tags).

The filter-branch script works by iterating over each named revision. For those that are commits, it applies all specified (non-tag) filters. The most appropriate one to use here would be the --env-filter.

(For those that are tags, it applies the given tag-name-filter if any. If none is given, it does nothing with tags. For this reason, you probably want --tag-name-filter cat, as documented in the examples. See the documentation for details.)

Once the script applies your filters, it then makes a new commit,¹ with whatever alterations you have made. Your filters are generally² fed to the shell's eval, which allows you to set environment variables. The critical environment variables in this case are the two that control the commit time-stamps: GIT_AUTHOR_DATE and GIT_COMMITTER_DATE.

Your environment filter should begin by extracting the existing dates from the commit, whose ID is given to you in $GIT_COMMIT. If those dates are outside the range to be modified, leave the corresponding environment variable alone: filter-branch has already set it from the original commit, so the existing date-and-time stamps will be carried into the new commit as well. (Do not unset it, or git will fall back to the current time.) If they are within your "change range", however, you will need to set and export the variables to the desired new values.

You will want/need to refine this (probably a lot, and it's quite untested), but an env filter might look something like this:

--env-filter 'at=$(git log --no-walk --pretty=format:%ai "$GIT_COMMIT")
    ct=$(git log --no-walk --pretty=format:%ci "$GIT_COMMIT")
    GIT_AUTHOR_DATE=$("$HOME/scripts/massage-time" "$at")
    GIT_COMMITTER_DATE=$("$HOME/scripts/massage-time" "$ct")
    export GIT_AUTHOR_DATE GIT_COMMITTER_DATE'

where $HOME/scripts/massage-time is a script you write to take the time stamp (here, formatted via %ai and %ci; choose your own favorite format) and massage it into your converted range. In fact, your massage script could use the environment variable $GIT_COMMIT directly, and simply produce as output the export GIT_AUTHOR_DATE=... commands (since, again, the output of your provided filter is fed to eval). (For testing purposes, though, it might work best if it takes the commit-ID as an argument. Then you can manually make sure it does the right thing with various sample commits, before using it as an environment filter.)
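As one possible shape for the massaging step, here is a minimal sketch that linearly squeezes Unix timestamps from the old date range into the new one, which keeps commits in their original order. The function name remap_epoch and the four boundary values are assumptions made for illustration (midnight UTC on 2013-08-01, 2014-03-01 and 2014-08-08). It works on epoch seconds, such as those printed by the %at and %ct format specifiers, rather than on %ai/%ci output.

```shell
#!/bin/sh
# Assumed range boundaries, as Unix epoch seconds (UTC). Adjust to taste.
OLD_START=1375315200   # 2013-08-01 00:00:00 UTC
OLD_END=1407456000     # 2014-08-08 00:00:00 UTC
NEW_START=1393632000   # 2014-03-01 00:00:00 UTC
NEW_END=1407456000     # 2014-08-08 00:00:00 UTC

# remap_epoch EPOCH  (hypothetical helper)
# Prints EPOCH mapped linearly from [OLD_START, OLD_END] onto
# [NEW_START, NEW_END]. Input outside the old range is clamped first,
# so the result always lands inside the new range.
# The intermediate product assumes 64-bit shell arithmetic.
remap_epoch() {
    _t=$1
    if [ "$_t" -lt "$OLD_START" ]; then _t=$OLD_START; fi
    if [ "$_t" -gt "$OLD_END" ]; then _t=$OLD_END; fi
    echo $(( NEW_START + (_t - OLD_START) * (NEW_END - NEW_START) / (OLD_END - OLD_START) ))
}
```

Inside an env filter, the result could be exported as, say, GIT_AUTHOR_DATE="$(remap_epoch "$at") +0000", which git accepts as a raw timestamp plus zone offset. Note that this assumed form discards each commit's original time zone, and that the function has to be defined in (or sourced by absolute path from) the filter string itself.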

Once the filter-branch script has finished making all these new commits, it then does the reference-name rewriting to point each ref-name to whichever new copy-commit corresponds to the original one. For instance, if refs/heads/master used to point to commit badface and the copy of badface is deadb17, the script makes refs/heads/master now point to deadb17. This is the way virtually all git commands work: they simply add new stuff to the repository, leaving the old stuff in it as well, and creating or moving reference-labels to point to the new stuff. (filter-branch also saves the original refs under refs/original/, so the old commits stay referenced until you delete those.) If and when the old stuff eventually becomes un-referenced, git gc can remove it at that point.

¹It actually runs your commit filter at this point, but supplies a default one that makes a new commit. If you supply your own commit filter, making the commit becomes your responsibility; this allows you to omit some commit(s).

²The eval rule applies to everything but the commit-filter. You can inspect the filter-branch script yourself to see: it's in the git-core directory, often in /usr/local/libexec/git-core or /usr/libexec/git-core depending on git installation.

score -2 · Answer 2 · answered Aug 08 '14 at 07:41

BASH it out :) Note that the following is written from memory and hasn't been tested, so I'll put in commentary to assist in explanation

# Run git logs to get all the commit ids you want
git log --before={2014-03-01} --after={2013-01-01} --author="your name" > filename

# Run grep, so you can isolate the commit ids
grep "commit" filename > commit_ids

# Run a bash loop on those ids and change date
for i in $(cat commit_ids); do echo $i; git commit  --amend --date "`date`" $i; done;

`git commit --amend` will only let you replace the tip-most commit. (Specifically, it does not allow a commit-ID argument.) — torek, Aug 08 '14 at 08:30
