14

I often switch back and forth between branches. I have a script that pushes the contents of the checkout to a 'running' environment where I can see the code run and test it (it's a web app). The push script uses rsync at its heart, and rsync uses timestamps to decide which files really need to be transferred. Because 'git-checkout' seems to set the timestamps on the files to the current time, rsync reports every file as being pushed, even when only the timestamp would be updated.

How can I have 'git-checkout' retain timestamps when switching between branches so that rsync will report only content changes?

I do not want to use rsync's checksum argument as it is very slow.

Peter Mortensen
Dale Forester
  • Are you actually concerned with performance? Or just that the rsync lists only the files that have content changes? – Emil Sit Mar 20 '12 at 15:22
  • @EmilSit I do want rsync to be fast so that it's not getting in my way, and I want the report not to be cluttered with changes that aren't 'real' to me (i.e. not real content changes). – Dale Forester Mar 20 '12 at 15:28
  • Is rsync slow for you currently? – Emil Sit Mar 21 '12 at 18:18
  • no, rsync is good, if I try the checksum mode it is slow, though – Dale Forester Mar 21 '12 at 22:42
  • Note that your use case should work better now (January 2015) with Git 2.2.2+. See [my answer below](http://stackoverflow.com/a/28256177/6309) – VonC Jan 31 '15 at 20:28
  • Candidates for the canonical question: *[What's the equivalent of Subversion's "use-commit-times" for Git?](https://stackoverflow.com/questions/1964470/)* (2009) and *[Checking out old files WITH original create/modified timestamps](https://stackoverflow.com/questions/2179722)* (2010). Mercurial has [the Timestamp extension](https://stackoverflow.com/a/7809151) (though that does not help much). – Peter Mortensen Sep 17 '21 at 10:16

3 Answers

10

The situation (regarding git checkout and timestamps) should be better with Git 2.2.2+ (January 2015).

The timestamp should not move anymore for files which are already up-to-date when doing a git checkout.

See commit c5326bd by Jeff King (peff):

checkout $tree: do not throw away unchanged index entries

When we "git checkout $tree", we:

  • pull paths from $tree into the index, and then
  • check the resulting entries out to the worktree.

Our method for the first step is rather heavy-handed, though; it clobbers the entire existing index entry, even if the content is the same.
This means we lose our stat information, leading checkout_entry to later rewrite the entire file with identical content.

Instead, let's see if we have the identical entry already in the index, in which case we leave it in place. That lets checkout_entry do the right thing.
Our tests cover two interesting cases:

  1. We make sure that a file which has no changes is not rewritten.
  2. We make sure that we do update a file that is unchanged in the index (versus $tree), but has working tree changes.
    We keep the old index entry, and checkout_entry is able to realize that our stat information is out of date.
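
The idea in that commit message can be illustrated with a toy model. This is only a sketch, not Git's actual code: the index is modeled as a dict mapping a path to a `(blob_id, stat_info)` pair, and the names `load_tree_into_index` and `paths_to_rewrite` are hypothetical.

```python
def load_tree_into_index(index, tree):
    """Toy model: pull paths from a tree into the index.

    index: dict of path -> (blob_id, stat_info)
    tree:  dict of path -> blob_id
    An identical existing entry is kept, stat info included.
    """
    new_index = {}
    for path, blob_id in tree.items():
        old = index.get(path)
        if old is not None and old[0] == blob_id:
            new_index[path] = old              # same content: leave entry in place
        else:
            new_index[path] = (blob_id, None)  # new content: stat info unknown
    return new_index


def paths_to_rewrite(index, worktree_stats):
    """Paths whose entry has no stat info or whose stat info is out of date."""
    return sorted(
        path
        for path, (_, stat) in index.items()
        if stat is None or stat != worktree_stats.get(path)
    )


if __name__ == "__main__":
    index = {"foo.c": ("blob1", 100), "bar.c": ("blob2", 200)}
    tree = {"foo.c": "blob1", "bar.c": "blob9"}
    worktree = {"foo.c": 100, "bar.c": 200}
    new_index = load_tree_into_index(index, tree)
    print(paths_to_rewrite(new_index, worktree))  # ['bar.c']
```

In this model, `foo.c` keeps its old entry and is not rewritten, so its timestamp would stay put. If the working-tree copy of `foo.c` had been edited, its stat info would no longer match and it would be rewritten as well, which corresponds to the second test case in the commit message.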
VonC
7

The reason that git checkout updates timestamps is that almost all build systems for source code depend on timestamps to determine if targets need to be rebuilt. If git checkout did not update timestamps on files when they are updated, these build systems would not correctly do an incremental build. In fact, git checkout should only update timestamps on files that have changed.
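
To make the build-system point concrete, here is a minimal sketch of a make-style staleness check. The function name `needs_rebuild` and the file names are made up for illustration; real build tools are more elaborate.

```python
import os
import tempfile


def needs_rebuild(target, sources):
    """Make-style rule: rebuild if the target is missing or older than any source."""
    if not os.path.exists(target):
        return True
    target_mtime = os.stat(target).st_mtime_ns
    return any(os.stat(src).st_mtime_ns > target_mtime for src in sources)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "foo.c")
        obj = os.path.join(tmp, "foo.o")
        for path in (src, obj):
            with open(path, "w") as f:
                f.write("x")
        # Source written first, object file built afterwards.
        os.utime(src, ns=(1_000_000_000, 1_000_000_000))
        os.utime(obj, ns=(2_000_000_000, 2_000_000_000))
        print(needs_rebuild(obj, [src]))  # False
        # A checkout rewrites the source and gives it a newer mtime.
        os.utime(src, ns=(3_000_000_000, 3_000_000_000))
        print(needs_rebuild(obj, [src]))  # True
```

If a checkout replaced `foo.c` with different content but kept an old timestamp, a check like this would return `False` and the stale `foo.o` would be reused.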

rsync should be efficient at updating timestamps and should not transfer any file data if only metadata has changed. You can verify this with the "speedup" figure in rsync's summary line. You can also ask recent versions of rsync to itemize changes with the -i flag. You can tell rsync not to use timestamps (so that every file goes through its checksum-based comparison) by leaving out -a or -t, but the rsync(1) man page recommends against that.
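
For reference, rsync's default "quick check" treats a file as unchanged when its size and modification time match on both sides. The following is a rough approximation of that test, not rsync's real implementation; the name `quick_check_same` is hypothetical and the comparison is simplified to whole seconds.

```python
import os
import tempfile


def quick_check_same(src, dst):
    """Approximate rsync's quick check: same size and same mtime (whole seconds)."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "local.txt")
        dst = os.path.join(tmp, "remote.txt")
        for path in (src, dst):
            with open(path, "w") as f:
                f.write("same content")
            os.utime(path, (1_000_000, 1_000_000))
        print(quick_check_same(src, dst))  # True
        # Identical content, but the local copy was touched by a checkout.
        os.utime(src, (2_000_000, 2_000_000))
        print(quick_check_same(src, dst))  # False
```

The second result is why a checkout that only bumps the timestamp makes a file show up in rsync's report even though its content is identical.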

Peter Mortensen
Emil Sit
  • I guess I'll take this as the answer as it explains the behaviour of git. I guess there is no real solution with git to my particular use case. Thanks – Dale Forester Mar 26 '12 at 14:07
  • It would be nice if "git checkout" had an option to not touch timestamps, if build systems are the only reason. For example, I have a script to sort the output of "git status" by last modification date and that is pointless after having switched branches. – blueyed Feb 07 '13 at 10:15
  • Hi! I have a similar case. And for some reason, git touches all the files. Also, I do not use the `-a` or `-t` flags, but `rsync` seems to think that the files have changed anyway. I cannot use `--size-only` because I need to update the file even if a letter has changed, so _if_ rsync would compare checksums only, or git would preserve original modification time, that would be perfect. Any ideas? – XedinUnknown Nov 11 '13 at 12:04
  • I wrote a helper script to update timestamps based on last commit times: https://gist.github.com/tstone2077/86529356dd120eb0e51f with this, you can run: git checkout && git settimes – tstone2077 Feb 13 '15 at 03:13
  • "In fact, git checkout should only update timestamps on files that have changed." I don't quite know what you mean here, but just to expand on the OP's question: if I have a file foo.c on a branch and I then switch to another branch which also contains foo.c, the last modified time of foo.c is changed to the current time. A minute later I switch back to the original branch, and foo.c is again updated to the current time. Nothing has changed in either branch, so why the updates? It becomes impossible to keep track of when files were last changed – Motorhead Aug 29 '22 at 23:45
1

It seems that the only reason Git does it this way is that, in a DVCS environment, you might replace a file that has a newer timestamp with a version that has an older timestamp, and cause a build problem. I don't think this is a good reason, because we rarely replace a newer-timestamped file with an older-timestamped one.

  1. I think this can be handled in a smart way: when updating file A with file B,

    file_timestamp = (timestamp(B) > timestamp(A)) ? timestamp(B) : timestamp(CURRENT);

  2. or, this could be designed as a configurable option.
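
The rule in item 1 can be written out as a small function. This is only a sketch of the proposal, not something Git does. It assumes A is the file currently in the working tree, B is the incoming version, and `timestamp(CURRENT)` means the current time; the name `proposed_timestamp` is made up.

```python
import time


def proposed_timestamp(ts_a, ts_b, now=None):
    """Keep B's timestamp when it is newer than A's, otherwise use the current time."""
    if now is None:
        now = time.time()
    return ts_b if ts_b > ts_a else now


if __name__ == "__main__":
    # B is newer than A: B's own timestamp is kept.
    print(proposed_timestamp(ts_a=100.0, ts_b=200.0, now=500.0))  # 200.0
    # B is older than A: fall back to the current time.
    print(proposed_timestamp(ts_a=200.0, ts_b=100.0, now=500.0))  # 500.0
```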

Peter Mortensen
jerry.lee
  • That behavior would break most build systems, because a dependent file might have a timestamp later than both timestamp1 and timestamp2, which means it would not be rebuilt--this would break the build. – Dietrich Epp May 24 '13 at 01:14