Sometimes a simple script of mine grows into a larger project, and only then do I decide to start tracking it with git. Up until now, I've been recording the date the code was actually written in the commit message. For example, a commit made on Mar 8 would be titled "Initial commit (Feb 11)", but I feel like it ends up looking kinda awkward. An example on GitHub

When I looked up questions on how to make a commit in the past, I noticed that some of the comments on those threads said that it's a bad idea to make commits in the past/future. Why is this the case?

Edit: I should clarify that the method I'm planning to use is setting the GIT_COMMITTER_DATE and GIT_AUTHOR_DATE environment variables to a certain point in time before making a commit as opposed to adding a --date flag to git commit. As far as I can tell, GitHub shows each commit in chronological order whereas git shows each commit in commit order. For example, commit A is in 2019, commit B is in 2020, and commit C is in 2018. GitHub would show C A B while git would show A B C (as far as git log goes). I don't think there should be a problem if I just keep things chronological, but I could be wrong. Finally, I only plan on doing this when initializing a repository from existing code.

  • Yes, your comment looks awkward, but anyone going into your history would probably be using the Git timestamp associated with the commit, rather than the date comment, so I don't see much of an issue here. – Tim Biegeleisen Feb 09 '21 at 04:36
  • @TimBiegeleisen A commit can have multiple timestamps though: primarily, the "author timestamp" and the "commit timestamp". You'll mostly notice they're different when doing a rebase. – Dai Feb 09 '21 at 04:38
  • It seems this is kind of a meta question about [this one](https://stackoverflow.com/questions/32315156/how-to-inject-a-commit-between-some-two-arbitrary-commits-in-the-past). – Romain Valeri Feb 09 '21 at 07:01

1 Answer

  • First, be aware of the differences between AuthorDate and CommitDate.
    • Ostensibly, AuthorDate is when the change was originally written, and CommitDate normally has the same value. If you rebase, cherry-pick, or otherwise rewrite history, the AuthorDate should remain unchanged while the CommitDate stores the timestamp of the rewritten commit.
      • If you rebase an already-rebased commit, the original AuthorDate will remain, but the CommitDate from the first rebase will be lost (assuming the first rebase's commits were removed after the second rebase).
  • Second, git (the system) does not care about chronological ordering.
    • Commits are strictly ordered only by their parent commit-id, so commits can have wildly varying timestamps in no chronological order; provided the commit-id references are valid, the repo is in a valid state.
    • In general, the timestamps are for the benefit of the human users, not the computer.
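
To make the ordering point concrete, here is a toy model in Python (not git's actual implementation). It reuses the A/B/C example from the question; the `Commit` class, the `walk` helper, and the specific dates are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    name: str
    parent: Optional[str]  # name of the parent commit, None for the root
    commit_date: str       # ISO date string; hypothetical values


# Linear history A <- B <- C whose dates are NOT in chronological order.
commits = {
    "A": Commit("A", None, "2019-01-01"),
    "B": Commit("B", "A", "2020-01-01"),
    "C": Commit("C", "B", "2018-01-01"),
}


def walk(head: str) -> list:
    """Follow parent links from `head` back to the root, newest first."""
    order = []
    current = head
    while current is not None:
        order.append(current)
        current = commits[current].parent
    return order


history = walk("C")  # order defined by parent links only
by_date = sorted(commits, key=lambda n: commits[n].commit_date)  # order by timestamp
print(history, by_date)
```

The parent-link walk never looks at `commit_date`, which is why a date-sorted view (as the question describes for GitHub) can disagree with it without the repo being invalid.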

What are the downsides to making a git commit in the past?

It depends on what exactly you mean by "make a commit in the past". You didn't specify that in your question so I'll run through a few different methods:

Method 1: Add a new commit to HEAD with an AuthorDate set to some point-in-time from the past.

This is valid and fine. It is often done when source code is migrated from another source-control system (like TFS, SVN, CVS, or Perforce) to git with its history preserved: the AuthorDate will have the original TFS/SVN changeset datetime from August 1998 or something, but the CommitDate will be that of the migratory commit to git in February 2021. This allows users to see that the code itself is ancient, and also when it was added to git (which can be useful to know if the same source code is migrated to git on multiple occasions).

(I note that a few projects and teams I've been on had transitioned from SVN to git but they just copied a snapshot of the project filesystem as the initial commit in the new git repo, so no history was preserved - that's really dumb and no-one should do that)

To set the AuthorDate of a commit, use the --date= command-line option for git commit.
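
As a minimal sketch of Method 1, the Python script below creates a throwaway repository, makes one empty commit with `--date`, and reads both dates back. It assumes a `git` executable is on the PATH; the identity, the helper names `git` and `backdated_author_commit`, and the example date are placeholders chosen for illustration.

```python
import subprocess
import tempfile


def git(*args, cwd):
    """Run a git command in `cwd` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


# Placeholder identity so the sketch does not depend on global git config.
IDENTITY = ["-c", "user.name=Example", "-c", "user.email=example@example.com"]


def backdated_author_commit(author_date):
    """Commit with a past AuthorDate; return (AuthorDate, CommitDate) as YYYY-MM-DD."""
    with tempfile.TemporaryDirectory() as repo:
        git("init", "-q", cwd=repo)
        git(*IDENTITY, "commit", "-q", "--allow-empty",
            f"--date={author_date}", "-m", "Initial commit", cwd=repo)
        author = git("log", "-1", "--format=%ad", "--date=short", cwd=repo)
        committer = git("log", "-1", "--format=%cd", "--date=short", cwd=repo)
        return author, committer
```

Calling `backdated_author_commit("2020-02-11 12:00:00 +0000")` should return the requested day as the AuthorDate, while the CommitDate stays at the day the script was run.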

So - what are the downsides and upsides to setting an AuthorDate value from the past?

  • Advantage: clearly expresses the date+time some human wrote some code, regardless of when it was added to git specifically.
  • Disadvantage: none, besides having to understand that the AuthorDate is not the same as the CommitDate and having to explain that to incredulous new users who don't understand how a git repo can have a commit dated from before git was even created in 2005.

Method 2: Add a new commit to HEAD with a CommitDate set to some point-in-time from the past while the AuthorDate is the real, present time.

I don't believe there is any reason to do this. It won't break anything (as you're committing to HEAD and not rewriting history), but it will confuse other users of your repo (as there's a general expectation that AuthorDate <= CommitDate).

To set the CommitDate to a specific value you need to use the GIT_COMMITTER_DATE environment variable (or change your computer's clock). The fact that you need to jump through hoops like setting an environment variable should be a hint that you probably shouldn't be doing this :)
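
For completeness, here is a hedged Python sketch of what Method 2 looks like in practice. It assumes `git` is on the PATH and uses a temporary repository; the function name, the identity values, and the example date are hypothetical. Only GIT_COMMITTER_DATE is set here; the approach described in the question's edit would additionally set GIT_AUTHOR_DATE in the same `env` mapping.

```python
import os
import subprocess
import tempfile


def run_git(args, cwd, env):
    """Run git with a custom environment and return its trimmed stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, env=env, check=True,
        capture_output=True, text=True,
    )
    return result.stdout.strip()


def commit_with_committer_date(committer_date):
    """Empty commit with a forced CommitDate; return (AuthorDate, CommitDate)."""
    env = {
        **os.environ,
        "GIT_COMMITTER_DATE": committer_date,
        # Placeholder identity so no global git config is needed.
        "GIT_AUTHOR_NAME": "Example",
        "GIT_AUTHOR_EMAIL": "example@example.com",
        "GIT_COMMITTER_NAME": "Example",
        "GIT_COMMITTER_EMAIL": "example@example.com",
    }
    with tempfile.TemporaryDirectory() as repo:
        run_git(["init", "-q"], repo, env)
        run_git(["commit", "-q", "--allow-empty", "-m", "Backdated"], repo, env)
        author = run_git(["log", "-1", "--format=%ad", "--date=short"], repo, env)
        committer = run_git(["log", "-1", "--format=%cd", "--date=short"], repo, env)
        return author, committer
```

With a past date such as `"2020-02-11 12:00:00 +0000"`, the result has a CommitDate earlier than the AuthorDate, which is exactly the inversion of the usual AuthorDate <= CommitDate expectation mentioned above.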

There are no "disadvantages" (or "advantages") to manually setting the CommitDate because it entirely depends on the reason you're setting it - but generally speaking, you shouldn't be setting this value yourself.

Method 3: Rewriting git history to insert a new commit in the past to an existing branch

I stress that it isn't actually possible to "insert" a commit into the commit graph like you can with a linked-list in a C program, because commits are identified by their content-addressable storage hashes (just like the Bitcoin blockchain!), so "inserting" a new commit will invalidate all subsequent commits. This is why rebase operations are often painful and will always leave you with duplicate commits in separate branches until you delete one of them. It is also why you need to git push --force after a rebase if you published to a shared branch... which will then break everyone else's local repos when they eventually do a pull.
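
The cascade can be illustrated without git at all. The sketch below hashes hand-built commit objects the way git does for SHA-1 repositories (the SHA-1 of `commit <size>\0` followed by the commit text). The identity, timestamp, and the choice to give every commit the same empty tree are simplifying assumptions for illustration; in a real rewrite the trees would usually differ too.

```python
import hashlib

# Well-known id of the empty tree in SHA-1 repositories.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "Example <example@example.com>"  # placeholder identity
WHEN = "1613000000 +0000"              # placeholder timestamp and timezone


def commit_id(parent, message, tree=EMPTY_TREE):
    """Compute the id of a commit object from its content, parent included."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines.append(f"author {WHO} {WHEN}")
    lines.append(f"committer {WHO} {WHEN}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Original history: A <- B <- C
a = commit_id(None, "A")
b = commit_id(a, "B")
c = commit_id(b, "C")

# "Inserting" X between A and B really means building new B and C objects.
x = commit_id(a, "X")
b_rewritten = commit_id(x, "B")
c_rewritten = commit_id(b_rewritten, "C")

print(b != b_rewritten, c != c_rewritten)
```

Because the parent id is part of the hashed content, B and C get new ids even though their messages and trees are unchanged, and anyone still holding the old B and C now has a diverged history.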

So - what are the downsides and upsides to rewriting history?

  • The advantage of rewriting history to add a new commit in the past is that it can be less confusing than having chronologically unordered commits, so people don't need to jump around their git commit graph to make sense of changes to the code.
  • But the major disadvantage is that it invalidates other people's repos if you rewrite a branch which is being tracked by other users - this will quickly make you hated by everyone on your team :)

W.r.t. your situation:

From what I interpreted from your question (which is lacking details, unfortunately) I think all that you need is to use the git commit --date={date} option to set the AuthorDate value to some meaningful timestamp that makes sense for your project.

Dai