1

I attended a lecture where I learned that whenever a file is changed, Git doesn't store the diff; instead, it stores a new snapshot of the modified file in the new version. I have a few questions:

  1. Is my understanding of the concept correct?
  2. If yes, how is storing the entire file better than storing diffs between two versions of the same file? Isn't it a waste of space?
  3. Why do we need to add a file before committing it? Why can't the commit be done directly? (I know this is slightly off topic.)

Please correct me if I have misunderstood anything.

Subham Tripathi
  • 2
    Your best option would be to read a few chapters from the Pro Git book http://git-scm.com/book/en/Getting-Started (it covers exactly your questions) – Isantipov Aug 18 '14 at 09:19
  • 2
    The third question has nothing to do with the first two. You had better split it off or, better yet, read up on it in the book @Isantipov linked, which explains the concepts of Git. The first two questions should be answerable here. – Jonas Schäfer Aug 18 '14 at 09:24

2 Answers

3

Don't confuse revisions with storage.

  • A revision is, unlike in other VCSs, a snapshot of the complete file tree. Git doesn't rebuild the current revision by applying deltas to the previous ones: each revision references everything it contains.
    See Git Basics

http://git-scm.com/figures/18333fig0105-tn.png

Since Git is at its core a content manager, two files with the same content are stored only once.
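To make the "stored once" point concrete, here is a minimal Python sketch of content addressing. It assumes the default SHA-1 object format, where a blob's ID is the hash of a short header plus the file content. The `store` dictionary and the `put` helper are hypothetical stand-ins for Git's object database, not its real implementation.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the ID Git's SHA-1 object format gives a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Hypothetical toy object store: object ID -> content.
store = {}

def put(content: bytes) -> str:
    """Store content under its ID; identical content lands on the same key."""
    oid = git_blob_id(content)
    store[oid] = content
    return oid

a = put(b"same text\n")       # e.g. the content of one file
b = put(b"same text\n")       # a second file with identical content
c = put(b"different text\n")  # a file with other content

print(a == b)      # True: same content, same ID
print(len(store))  # 2: the duplicate was not stored again
```

Because the ID depends only on the content, the file name plays no part in deciding whether something is stored again.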

As for the index, it lets you prepare the next commit, not only with the files you want to add but, in some cases, with only parts of those files.

VonC
1
  1. Yes. Periodically, though, Git packs older objects into pack files, which store similar files as deltas against each other and compress them. You really don't need to worry about this.

  2. If you don't change a file, it isn't duplicated by a commit.
    So most files exist only once; only modified files are stored as new objects. Combined with the periodic packing, this keeps storage in check.

  3. (Somewhat unrelated to the storage format.) Staging changes, adding to the index, or simply adding (all equivalent) lets you select which changes you want to commit. It enables you to split several changes across multiple commits, which is very useful. (You can use git add -A to stage all changes in one command.)
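The points above can be sketched as a toy model in Python. This is a heavily simplified illustration, not how Git is implemented: `ToyRepo`, its attributes, and the file names are all hypothetical, and real commits also carry trees, parents, and metadata. It shows that a commit snapshots whatever is in the index, and that an unchanged file keeps pointing at the same stored object.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Blob ID as in Git's SHA-1 object format: hash of header plus content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

class ToyRepo:
    """Hypothetical, simplified model of objects, index, and commits."""

    def __init__(self):
        self.objects = {}  # blob ID -> content, shared by all commits
        self.index = {}    # path -> blob ID: the snapshot being prepared
        self.commits = []  # each commit is a full path -> blob ID mapping

    def add(self, path: str, content: bytes) -> None:
        """Stage a file: store its content and record it in the index."""
        oid = blob_id(content)
        self.objects[oid] = content
        self.index[path] = oid

    def commit(self) -> None:
        """Record the current index as a complete snapshot."""
        self.commits.append(dict(self.index))

repo = ToyRepo()
repo.add("a.txt", b"alpha v1\n")
repo.add("b.txt", b"beta v1\n")
repo.commit()

# Suppose both files were edited, but only a.txt is staged for this commit.
repo.add("a.txt", b"alpha v2\n")
repo.commit()

first, second = repo.commits
print(first["b.txt"] == second["b.txt"])  # True: b.txt reuses the same blob
print(first["a.txt"] == second["a.txt"])  # False: a.txt got a new blob
print(len(repo.objects))                  # 3 blobs for 2 commits of 2 files
```

Each commit is a complete snapshot, yet only the changed file costs extra storage; the unstaged edit to b.txt simply waits for a later commit.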

More info on pack files:

http://git-scm.com/book/en/Git-Internals-Packfiles

Willem D'Haeseleer