
As far as I understand how `git status` works (please correct me if I am wrong), a modified file is detected when its SHA-1 hash doesn't match any file name stored inside the `.git/objects` directory.

However, I have a hard time imagining that Git reads every existing file in the working tree in order to compute and compare its hash, because this seems like a slow process.

Are there any optimisation techniques that I am not aware of, such as caching?

Thanks :)

Maxime Helen
  • *a modified file is detected when its sha-1 hash doesn't match any file name stored inside .git/objects directory* This isn't quite right. A file is modified in the index, with respect to the `HEAD` commit, if and only if the two hashes do not match. A file is modified in the work-tree, with respect to the index, if and only if `git add` would write to the repository a new or existing blob whose hash differs from that in the index now. The cache aspect of the index (see linked duplicate) avoids needing to compute the work-tree-file hash in most cases. – torek Oct 05 '17 at 23:25
  • The most important takeaway from this comment is: files are modified, or not modified, **with respect to** some other file. It's not a standalone thing: you must pick some other file to compare to. Moreover, when dealing with `git status`, there are **two** (not one) modified/unmodified statuses per file: HEAD-vs-index, and index-vs-work-tree. – torek Oct 05 '17 at 23:26
  • Thanks, this is clearer now. I still have a question about your statement _The cache aspect of the index (see linked duplicate) avoids needing to compute the work-tree-file hash in most cases._: in which cases is the cache not trustworthy enough for the comparison? – Maxime Helen Oct 09 '17 at 08:06
  • See https://www.kernel.org/pub/software/scm/git/docs/technical/racy-git.txt – torek Oct 09 '17 at 14:26
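
To make the caching idea from the comments above more concrete, here is a minimal Python sketch. The `git_blob_sha1` function follows the way Git hashes a blob (a `blob <size>\0` header followed by the content). Everything else is an assumption for illustration only: `ToyIndex`, its `entries` layout, and the `hash_calls` counter are hypothetical names, and the sketch compares only size and modification time, whereas the real index records more stat fields and has extra handling for the racy case described in the linked document.

```python
import hashlib
import os
import tempfile


def git_blob_sha1(data: bytes) -> str:
    """Hash bytes the way Git hashes a blob: 'blob <size>\\0' header + content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class ToyIndex:
    """Hypothetical, heavily simplified stand-in for Git's index (not the real format)."""

    def __init__(self):
        self.entries = {}    # path -> (size, mtime_ns, blob_hash)
        self.hash_calls = 0  # how many times file contents were actually read

    def _hash_file(self, path):
        self.hash_calls += 1
        with open(path, "rb") as f:
            return git_blob_sha1(f.read())

    def add(self, path):
        """Record the file's stat data and blob hash, loosely like `git add`."""
        st = os.stat(path)
        self.entries[path] = (st.st_size, st.st_mtime_ns, self._hash_file(path))

    def is_modified(self, path):
        """Index-vs-work-tree check that avoids hashing when stat data matches."""
        size, mtime_ns, blob_hash = self.entries[path]
        st = os.stat(path)
        if (st.st_size, st.st_mtime_ns) == (size, mtime_ns):
            return False  # stat data unchanged: trust the cache, skip hashing
        # Stat data differs: the content may or may not have changed, so hash it.
        return self._hash_file(path) != blob_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "a.txt")
        with open(path, "wb") as f:
            f.write(b"hello\n")

        index = ToyIndex()
        index.add(path)
        print(index.is_modified(path), index.hash_calls)

        with open(path, "wb") as f:
            f.write(b"hello world\n")
        print(index.is_modified(path), index.hash_calls)
```

In this toy model the first check never reads the file, because the cached stat data still matches; only the second check, after the file has been rewritten with a different size, falls back to hashing. The weak spot is an edit that keeps the same size and lands within the timestamp granularity of the cached entry: the toy would wrongly report the file as unmodified, which is the kind of situation the racy-git document deals with.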
