6

With the latest Debian version of git (I'm using 1.7.2.5), I've noticed that a .git/index file may change mysteriously, without my having performed any operation that I feel should change the repository. (My shell occasionally runs git branch so it can display what branch is checked out, but that shouldn't change anything.) The change results in a .git/index file with the same length as the original, but containing different bits. What causes this change, and how can I stop it?

(The change is inconvenient because it messes things up for the Unison file synchronizer.)

Norman Ramsey
  • 3
    The index is updated with stat information every time a file in the working directory is inspected by git. It makes subsequent operations faster because git can skip inspecting the contents of a file in many cases if it hasn't been updated since the last git operation. Can't you exclude your .git directory from file synchronization? – CB Bailey Aug 25 '12 at 22:51
  • Why are you using `unison` to synchronize your git repo? That's kind of strange. Just use `git` directly. – Lily Ballard Aug 25 '12 at 22:55
  • 2
    @KevinBallard I'm using unison to synchronize a home directory containing hundreds of thousands of files and many gigabytes, as well as several dozen git repos. Most of these git repos have no other replicas. – Norman Ramsey Aug 25 '12 at 23:13
  • @CharlesBailey that's an answer. Can you make it so? I'll upvote. (I never understand why people like to answer in comments.) – Norman Ramsey Aug 25 '12 at 23:13
  • Incidentally, are you sure that `git branch` is to blame? I couldn't get a query version of `git branch` to update the index. In a script you might want to consider parsing `symbolic-ref HEAD` instead to get the current branch. – CB Bailey Aug 25 '12 at 23:13
  • @CharlesBailey I have no idea what's to blame. – Norman Ramsey Aug 25 '12 at 23:14
  • @NormanRamsey: It's **not** an answer. You asked how to stop a mysterious update to the index. I don't even know what's doing it let alone whether it's feasible to prevent it. I am unable to answer your question so I haven't provided an answer. – CB Bailey Aug 25 '12 at 23:15
  • I came here after observing a similar change to .git/index after editing a file with emacs but performing no git operation directly. The change was detected by an `rsync -n` operation that I was experimenting with in preparation for unrelated work. I speculate that emacs invoked some git operation upon accessing a .c file, because I know of no other operation that occurred in the FS branch. But I don't know the precise chain of causality. I hope this comment helps someone. – Bill Michaelson Jan 11 '18 at 16:31
  • The culprit turned out to be emacs VC mode: https://emacs.stackexchange.com/questions/38418/could-magit-be-writing-git-index-without-my-intervention – Norman Ramsey Mar 25 '18 at 13:26

6 Answers

3

The index file shouldn't just randomly change. It is the staging tree, a buffer between the repository of commits and the working tree. For efficiency, it also stores some metadata about the working tree (the checked-out files, which you can modify), which allows faster status or diff results. To see what kind of information is stored, run `git ls-files --debug`. For each file in the index, this should print something like:

path/to/file
  ctime: 1332898839:873326227
  mtime: 1332898839:873326227
  dev: 2052     ino: 4356685
  uid: 1000     gid: 100
  size: 3065    flags: 6c

So if a file changes on disk in any way, even if its content stays the same and only metadata such as its inode number changes, git will update the index file the next time a command refreshes the index.

`git branch` doesn't update the index: it only reads the .git/HEAD file, the .git/refs/heads directory, and the .git/packed-refs file, and it doesn't care about the index or the working tree. `git diff` and `git status`, on the other hand, do work with the index.
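
For a shell prompt, the branch name can be read from HEAD alone. A minimal sketch, assuming a POSIX shell; `current_branch` is a hypothetical helper name, not a git command:

```shell
# Print the current branch name using only HEAD.
# current_branch is a hypothetical helper name chosen for this example.
current_branch() {
    # symbolic-ref resolves HEAD; -q keeps it quiet on a detached HEAD,
    # and stderr is discarded in case we are not inside a repository.
    ref=$(git symbolic-ref -q HEAD 2>/dev/null) || return 1
    # Strip the leading "refs/heads/" to leave the bare branch name.
    printf '%s\n' "${ref#refs/heads/}"
}
```

On a detached HEAD or outside a repository the function prints nothing and returns non-zero, so a prompt can simply omit the branch in those cases.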

I did an experiment: I copied the current index file, replaced a tracked file with a version guaranteed to have a new inode (copy it, remove the original, rename the copy back to the original name), ran `git status`, and then compared the new index file with the saved copy. Two things changed: the bytes right before the affected file's name, which is where that entry's stat data lives, and a few bytes at the very end of the index file, which is where the index format stores a checksum of the file's contents. The overall size of the file remained the same.
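
The experiment can be repeated in a throwaway repository. This is only a sketch: the temporary paths, the file name `file.txt`, and the demo commit identity are all made up for the example, and whether the index is actually rewritten may depend on the git version and filesystem.

```shell
#!/bin/sh
# Sketch of the experiment in a scratch repository. Every path and file
# name here is an example, not something from the original setup.
set -e

repo=$(mktemp -d)          # scratch repository
before=$(mktemp)           # saved copy of the index
cd "$repo"
git init -q
echo 'hello' > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

cp .git/index "$before"

# Give file.txt a new inode without changing its content:
# copy it, remove the original, rename the copy back.
old_inode=$(ls -i file.txt | awk '{print $1}')
cp -p file.txt file.txt.copy
rm file.txt
mv file.txt.copy file.txt
new_inode=$(ls -i file.txt | awk '{print $1}')
echo "inode: $old_inode -> $new_inode"

# Any command that refreshes the index will do; git status is the usual one.
git status > /dev/null

if cmp -s "$before" .git/index; then
    echo 'index file unchanged'
else
    echo 'index file rewritten'
    # List the differing byte offsets to see where the changes fall.
    cmp -l "$before" .git/index | awk '{print $1}' | tr '\n' ' '
    echo
fi
```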

Back to your problem, if you're not executing any command that touches the index yourself, then maybe you have another tool that does that for you: an IDE plugin or a file browser extension that knows about git repositories, and which checks the status of git repositories. Or, there's another process that changes the way files are stored on disk, like a disk-defrag utility.

Sergiu Dumitriu
  • 2
    Git's index (the logical contents) may not change when you do (e.g.) `git status` but the index file itself (`.git/index`) often will and this is what the question is really asking about. – CB Bailey Aug 26 '12 at 07:18
  • Why would it change? When you say that _the index file itself often will [change]_, is that based on knowledge, or is it just an assumption based on the symptoms reported by the OP? – Sergiu Dumitriu Aug 26 '12 at 07:29
  • 1
    git will update the cached information in `.git/index` if it's performing an operation that inspects the working tree and finds the cached information is out of date. This might happen with `git status`, `git diff`, `git grep`, etc. – CB Bailey Aug 26 '12 at 07:33
  • Indeed, I dug deeper and found out that you're right; updated the answer accordingly. – Sergiu Dumitriu Aug 26 '12 at 08:11
  • This stuff is really interesting. I recently upgraded to Emacs 23 and I wonder if it's the culprit. I will try to duplicate your experiment. – Norman Ramsey Aug 28 '12 at 03:05
  • 1
    So what's the conclusion? Is Emacs the culprit? – Sergiu Dumitriu Sep 03 '12 at 02:11
3

I've come across this issue as well, and I believe the interaction between unison and git is causing the problem. When unison synchronizes the two directories, it doesn't synchronize the ctimes. So in one copy of the git repository, say copy 2, the file ctimes don't match the times stored in .git/index, and .git/index in copy 2 gets updated the next time you run a git command that stats files. When unison runs again, that .git/index is copied to copy 1, where its contents don't match the ctimes either. So the next time a git command is run there, the index is updated. Then unison copies it back to copy 2, and so on.

I haven't found a reasonable workaround for this. Setting core.trustctime=false doesn't help.

To the extent that .git/index is a cache, it should be omitted from synchronization by unison. But I believe that .git/index is also used to stage files, and one might start that process on one machine and finish it on another, which would require .git/index to be synchronized.

(I know some people think it's odd to synchronize git repos with unison, but the point of unison is that you can switch between working on two different machines and continue exactly where you left off. It's an amazing tool!)

Dan Christensen
  • I don't think it's that it "doesn't synchronize" the ctimes and mtimes, I suspect it's that it *does* synchronize the mtimes. And the mtime on one machine is not going to be the mtime on another. (Plus, Unison itself probably jacks up the ctimes and mtimes in your workdir.) – Edward Thomson Feb 13 '14 at 16:38
  • 1
    @EdwardThomson: Unison does synchronize the mtimes (not to the present, but to match the time in the other copy), but that's not enough to keep git happy, as it updates .git/index even if only the ctimes change. Unison does not itself change the ctimes unless it changes something else about a file, but it doesn't (and really can't) *synchronize* the ctimes. So the loop above happens without *any* changes to ctimes and mtimes of the files in the git repo. It happens because the .git/index file keeps getting switched between storing the ctimes from one copy and the ctimes from the other copy. – Dan Christensen Feb 18 '14 at 21:31
  • Ah yes, that does make sense doesn't it. I would think that this suggests that you shouldn't use Unison to synchronize a non-bare repository, but that's just my opinion. – Edward Thomson Feb 18 '14 at 21:48
0

This will probably not be the solution for the author of this question, but in my case the daily autocommit feature of etckeeper was the culprit.

jgosmann
0

I see the same issue on a setup where Unison syncs my home directory (containing 3 git repos) between 2 machines, and a cron job cds into each repo directory and runs `git status` every day (emailing me if changes are not checked in). My testing indicates it's caused by the fact that .git/index stores machine-specific data such as the inode numbers of files[1].

To test this, take a repo that is already synchronised and identical on the 2 machines. Copy the .git/index from one machine to the other, e.g. `scp -p machineB:/home/me/myrepo/.git/index /home/me/myrepo/.git/index`

Now compare the two files and you should see they're identical: `sha1sum /home/me/myrepo/.git/index` and `ssh machineB "sha1sum /home/me/myrepo/.git/index"`

Now run: `git status`

Now compare the 2 files again and you'll find they now differ: `sha1sum /home/me/myrepo/.git/index` and `ssh machineB "sha1sum /home/me/myrepo/.git/index"`

I don't see a solution to this, since you can't use git without running commands like git status which update the index.
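
To find out which commands rewrite the index on a single machine, the same checksum comparison can be wrapped around any command. A minimal sketch, assuming `sha1sum` is available as in the steps above; `index_changed_by` is a hypothetical helper name:

```shell
# Run a command from the top level of a working tree and report whether
# .git/index changed. Returns 0 if the index file's checksum differs
# afterwards, 1 otherwise. index_changed_by is a made-up name.
index_changed_by() {
    before=$(sha1sum < .git/index)
    "$@" > /dev/null 2>&1
    after=$(sha1sum < .git/index)
    [ "$before" != "$after" ]
}

# Example use:
#   index_changed_by git status && echo 'git status rewrote the index'
```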

[1] https://github.com/git/git/blob/867b1c1bf68363bcfd17667d6d4b9031fa6a1300/Documentation/technical/index-format.txt#L38

happyskeptic
0

The culprit turned out to be Emacs VC mode: https://emacs.stackexchange.com/questions/38418/could-magit-be-writing-git-index-without-my-intervention

In order to make this text an answer, not a comment, I have to say more. So the correct answer is reproduced here:

Emacs VC uses timers to periodically refresh some information and calls git commands to do so and some of those touch the index.

Provided VC was the cause of this issue, then deleting Git from vc-handled-backends would likely fix it.

Norman Ramsey
0

Disabling VC in Emacs does not really solve this problem. It only prevents Emacs from running `git status` on its own; running it manually will still modify .git/index and lead to spurious modifications and conflicts in Unison.

The git mailing list suggests a workaround [1] that works for me:

  • Enable mtime syncing in Unison (times = true)
  • git config core.trustctime false
  • git config core.checkstat minimal

(The git-config options may be set globally of course.)

With these settings, git now only looks at the integral part of the mtime and at the file size when checking whether a file has been modified, and both are synchronized by Unison.
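
The two git settings can be applied per repository with a small helper. A sketch, assuming a git version that supports `core.checkstat`; `configure_repo_for_unison` is a hypothetical name, and the Unison side still needs `times = true` in its profile:

```shell
# Hypothetical helper: apply the workaround's git settings to the
# repository whose working tree is at "$1".
configure_repo_for_unison() {
    (
        cd "$1" &&
        git config core.trustctime false &&
        git config core.checkstat minimal
    )
}

# Example use (the path is a placeholder):
#   configure_repo_for_unison "$HOME/myrepo"
```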

[1] https://marc.info/?l=git&m=157937653401027