Considering your questions and the comments and edits you made, it might be useful to point out that Git is a snapshot-oriented VCS. Each commit points to a tree object that leads to every file the commit references.
Each time you change one of your files, even by a single byte, the result is treated as new content and recorded in full. Since all contents are indexed by their SHA-1 sum, though, only distinct contents are saved separately. If you record the same file several times or revert to a previous version of it, that content is stored only once. All of this is also compressed, so it rarely causes disk-space problems in practice.
Seen this way, the behaviour is similar to a filesystem snapshot mechanism, which makes it acceptable.
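To make the "only different contents are saved" point concrete, here is a minimal Python sketch of a content-addressed store. The `blob_id` function follows the way Git names a blob in a SHA-1 repository (a `blob <size>` header, a NUL byte, then the raw content); the `ObjectStore` class is a hypothetical toy of mine, not Git's real storage code, and it ignores compression and packfiles.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object id Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Toy content-addressed store (hypothetical): one entry per distinct content."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        oid = blob_id(content)
        # Storing identical content again is a no-op: same key, same value.
        self.objects.setdefault(oid, content)
        return oid


store = ObjectStore()
v1 = store.put(b"hello\n")
v2 = store.put(b"hello, world\n")  # different content: a new object
v3 = store.put(b"hello\n")         # back to the first version: nothing new is stored
print(v1 == v3, len(store.objects))
```

Three versions were recorded, but the store holds only two objects, because the first and the third have the same id.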
This answers your first question: files are always recorded, not changes. What you see when you browse a commit is actually an automatic "diff" operation between this commit and its parent. This also lets you easily "diff" two arbitrary revisions without having to resolve anything first. It also guarantees that once you can reach a commit, you have access to all of its files, even if you can't see its history (useful with shallow clones or if your repository is corrupted).
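As a rough illustration of why this works, assume each commit is reduced to a plain mapping from path to content id. The names `diff_snapshots`, `c1`, `c2` and `c3` are made up for the example, and real Git compares tree objects rather than dictionaries, but the principle is the same: the diff is computed on demand from two complete snapshots.

```python
def diff_snapshots(old, new):
    """Compare two snapshots given as {path: content id} mappings."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and old[p] != new[p])
    return {"added": added, "removed": removed, "modified": modified}


# Three hypothetical commits, each one a full picture of the project.
c1 = {"README": "id-a", "main.c": "id-b"}
c2 = {"README": "id-a", "main.c": "id-c"}
c3 = {"README": "id-d", "main.c": "id-c", "Makefile": "id-e"}

print(diff_snapshots(c1, c2))  # what browsing c2 would show
print(diff_snapshots(c1, c3))  # two arbitrary revisions, c2 is never consulted
```

Comparing `c1` with `c3` does not need `c2` at all, which is why diffing arbitrary revisions requires no replay of intermediate changes.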
If you now want to automatically include all modified files each time you commit, you can use `git add -u` to stage all updated files, `git add -A` to include new files along with updated ones, or even `git commit -a` to perform an `add -u` and a `commit` in a single operation.
You can also easily define command aliases, either outside Git if you're using a shell, or in the `[alias]` section of your `gitconfig` file. For example, I personally use:
[alias]
root = rev-parse --show-toplevel
… to have a `git root` command that finds the root directory of my repository.
But… you probably don't want to do this.
The reason this is not automatic in Git is to encourage the developer to prepare "cooked", unitary commits that each focus on one purpose, even if that purpose spans multiple files and even if, conversely, a single file has been changed in different places for different purposes.
From this point of view, staging all modified files at once is generally pointless: unless you commit very frequently, it's very unlikely that all modified files concern a single topic.
If you really don't care about that and all you want is to save the state of your work, it remains easy to do so with the commands above. But trust me, making clean commits is AT LEAST as valuable as the code itself. It's really important when you work alone, and it becomes critical in a team.
As for the index: it's actually a very clever way to handle the whole thing. At first glance, the index is simply the list of files that are already tracked. It's a flat, binary file located at `.git/index`. But it doesn't just hold the names; it also refers to the content objects these files are associated with. This means that objects are created at `add` time and, when you commit, Git simply needs to record the state of this index.
It's really interesting because this is what enables Git to know whether a file is unchanged, staged, unstaged or even both staged and unstaged. Also, when you select hunks with `git add -p`, Git doesn't collect temporary bits of changes stored somewhere: it directly amends the index, which then allows you to prepare exactly what you want or revert it to its initial state if you change your mind.
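Here is a small Python sketch of how these states can be derived from three pictures of the same path: the last commit, the index and the working directory. The `classify` function and the sample dictionaries are hypothetical simplifications of mine; real Git also uses file metadata cached in the index to avoid rehashing, and untracked files are left out here.

```python
def classify(path, head, index, worktree):
    """Return the status labels for one tracked path.

    head, index and worktree map paths to content ids (hypothetical
    stand-ins for the last commit, .git/index and the files on disk).
    """
    labels = []
    if index.get(path) != head.get(path):
        labels.append("staged")      # index differs from the last commit
    if worktree.get(path) != index.get(path):
        labels.append("unstaged")    # disk differs from the index
    return labels or ["unchanged"]


head = {"a.txt": "v1", "b.txt": "v1", "c.txt": "v1", "d.txt": "v1"}
index = {"a.txt": "v1", "b.txt": "v2", "c.txt": "v1", "d.txt": "v2"}
worktree = {"a.txt": "v1", "b.txt": "v2", "c.txt": "v2", "d.txt": "v3"}

for name in sorted(head):
    print(name, classify(name, head, index, worktree))
```

The `d.txt` case is roughly what you get after picking only some hunks with `git add -p`: the index holds an intermediate version that matches neither the last commit nor the file on disk.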
Git is not as cryptic as it seems. The only notions you need to master are the object model, the way the index works and, optionally, the reflog, so you can recover easily when something goes wrong. In particular, DON'T try to emulate Mercurial's behaviour: it looks easy at first but leads you pretty soon into a dead end.
You may be interested in this post: What is the use of Staging area in git