Why does git's design distinguish between "ignored" and "untracked" files?

Question

I understand the difference between ignored and untracked files, at least operationally, in git's standard workflow.

What I have a hard time understanding is why the designers of git felt that this was an important distinction to make.

IOW, why didn't they just go for automatically tracking every file that is not excluded by a .gitignore or .git/info/exclude or global excludesfile rule?

Just to be clear: I'm not criticizing git's design. I'm sure there's a very good reason for this ignored/untracked distinction. I'd just like to understand the rationale behind it, from the point of view of design.

EDIT: Let me put it this way. Suppose there was a tool out there, let's call it twit, that was identical to git in every respect except that it had no concept of "untracked": a file could either be ignored or tracked. Can someone describe a scenario that would clearly show git's superiority over twit?

EDIT2: I realize now, in retrospect, that implicit in my question is the assumption that a "good reason" is also a "readily understandable" one. This assumption, however, doesn't hold up. It is possible that the shortcomings of twit could be perceived only after working with it for some time, and that these shortcomings would then lead the users of twit to improve it into something that in the end looks like git.

[This SO article](http://stackoverflow.com/questions/21991065/difference-between-git-ignore-and-untrack) discusses something somewhat related to your question. — Tim Biegeleisen, Jan 30 '16 at 06:16
The way I think of it is that you have to explicitly add changes to an existing file to the index, so why wouldn't you need to explicitly add a new file to the index? — mzulch, Jan 30 '16 at 06:17
This is not exactly the same issue, but for what it's worth, git's `.gitignore` also fails to distinguish between "files that git should ignore because they are unimportant" (such as cached data that is rebuilt as needed, e.g., compiled *.pyc python files) and "files that git should ignore because they are precious but not distributable" (e.g., files containing sensitive data, whether plaintext or encrypted, such as passwords). Adding this turns out to be nontrivial (I've looked at it a bit several times now). — torek, Jan 30 '16 at 07:49

score 6 · Accepted Answer · edited May 23 '17 at 12:01

Because not ignoring a file does not mean "tracking everything":

- You can add and commit incrementally, so that you do not "automatically track" a thousand files just because they are not ignored.
- You can discover files that are not yet ignored but should be. If they had been "automatically tracked", it would be too late: you would have to git rm them first. Or worse, they would already have been committed and pushed by the time you discovered that they contained sensitive information.

Anything "automatic" is a dubious idea, because your working tree evolves quite dynamically, and tracking it is an incremental process led by reflection, not an "automatic" one led by a tool.
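
To make the two points above concrete, here is a minimal toy model in Python. It is not how git is implemented: the function names (`classify`, `git_like_commit`, `twit_like_commit`) and the file names are made up for illustration, and `fnmatch` only approximates real `.gitignore` pattern matching. It sketches what a commit would pick up under git's three states versus the question's hypothetical two-state "twit".

```python
from fnmatch import fnmatch


def classify(paths, ignore_patterns, index):
    """Sort working-tree paths into tracked / ignored / untracked (toy model)."""
    states = {}
    for path in paths:
        if path in index:
            # Explicitly added at some point; ignore rules no longer apply.
            states[path] = "tracked"
        elif any(fnmatch(path, pattern) for pattern in ignore_patterns):
            states[path] = "ignored"
        else:
            # Visible in status output, but never committed implicitly.
            states[path] = "untracked"
    return states


def git_like_commit(states):
    """Only files the user explicitly added are committed."""
    return sorted(p for p, s in states.items() if s == "tracked")


def twit_like_commit(states):
    """Hypothetical 'twit': everything that is not ignored is committed."""
    return sorted(p for p, s in states.items() if s != "ignored")


# Hypothetical working tree: one source file, one build artifact,
# and one sensitive file nobody has written an ignore rule for yet.
working_tree = ["main.py", "main.pyc", "credentials.txt"]
ignore_patterns = ["*.pyc"]
index = {"main.py"}

states = classify(working_tree, ignore_patterns, index)
print(states)
print("git-like commit: ", git_like_commit(states))
print("twit-like commit:", twit_like_commit(states))
```

In this sketch the git-like commit contains only `main.py`, while `credentials.txt` shows up as "untracked", which gives you the chance to notice it and add an ignore rule. The twit-like commit would already include it.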

Suppose there was a tool out there, let's call it twit, that was identical to git in every respect except that it had no concept of "untracked"

"Tracked", in git, would then imply "automatically indexed (or staged)", as in "ready to be committed".
And the whole idea behind the git index is to allow a developer to incrementally track files or, for a single file, to incrementally track its content (git add --patch).
All of that would go out the window with "everything tracked by default".

See more with "Why staging directory is also called Index/Git Index?".

Note that there have been earlier proposals to "re-invent the git interface", with everything staged/tracked by default.

But, as demonstrated in the 2010 article "You could have invented git (and maybe you already have!)", the original goal behind git is to merge patches. When you add multiple patches to your working tree, you don't want to blindly track everything, but to slowly add or track what you need to validate, patch after patch.

"tracked, in git, means ready to be committed." That sounds to me closer to what I'd call "staged" than what I call "tracked". Granted, the distinction between these two adjectives is blurred by the fact that one uses `git add` both to tell `git` to track a file, *and* to tell `git` which among the tracked files that have changed relative to `HEAD` should be staged for the next commit. In fact, it was while thinking about this double-duty of `git add` that I realized that I really did not understand the rationale for the "ignored"/"untracked" distinction. — kjo, Jan 30 '16 at 06:38
