Your question is ambiguous at best, and contains some bad assumptions, so this answer is long.
Some background about Git commits and git init
All commits in Git always contain all files. That's how Git itself works.
Running git init will either:
- create a new, empty Git repository in the current working directory, or
- re-initialize the existing Git repository wherever it is.
You get the second behavior, re-initializing the existing Git repository, if the directory in which you run it already holds a Git repository. The output of git init tells you which one it did:
$ git init
Initialized empty Git repository in [path, redacted]
$ git init
Reinitialized existing Git repository in [path, redacted]
Except for some special cases that almost certainly don't apply to how you're using Git, the "reinitialization" variant doesn't really do anything at all: your existing repository remains unchanged.
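If you want to see both behaviors for yourself, here is a minimal sketch. It assumes a Unix-style shell with git on the PATH, and it uses a throwaway directory from mktemp (the variable names are just placeholders for this demo):

```shell
set -eu
export LC_ALL=C                  # keep Git's messages in English
repo=$(mktemp -d)                # hypothetical scratch directory, not a real project
cd "$repo"

first=$(git init 2>/dev/null)    # no repository here yet: creates a new, empty one
second=$(git init 2>/dev/null)   # a repository exists now: reinitializes it

echo "$first"
echo "$second"
```

The first line printed should begin with "Initialized empty Git repository" and the second with "Reinitialized existing Git repository", each followed by the path of the scratch directory.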
When git init creates a new, totally-empty repository, there are no commits and therefore no branches yet. The next commit you make is thus the first commit ever. This commit is a bit special: it is a root commit, with no history. It contains whatever files you tell Git to put in it, using git add.
After this point, though, you have an existing Git repository with existing commits. This includes the case where you use git clone to copy some existing repository (e.g., from GitHub) to a new Git repository on your own machine (e.g., your laptop). You will tell Git to check out some particular commit (usually the tip commit of some branch name), which means Git will fill in both its staging area and your working tree with all the files from that commit.
Subsequently, you'll edit some files and maybe even create some new ones. You then run git add on one or more of these files. If you git add a file that already exists in Git's staging area, Git tosses out the old staging-area copy and replaces it with a new copy made from your working tree. Or, if you git add a totally new file, Git copies the file into its staging area, as a new file.
In all of these cases, all the existing files in the staging area remain there. Your next git commit takes all the files that are in Git's staging area, and makes a snapshot from them.
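To watch the staging area keep its existing files, you can list it with git ls-files -s before and after a git add. This is only a sketch under some assumptions: a Unix-style shell, a scratch repository from mktemp, and made-up file names (a.txt through d.txt) and a placeholder identity for the commit:

```shell
set -eu
repo=$(mktemp -d)                        # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"           # placeholder identity for the demo commit
git config user.email "example@example.invalid"

printf 'one\n'   > a.txt
printf 'two\n'   > b.txt
printf 'three\n' > c.txt
git add a.txt b.txt c.txt
git commit -q -m "initial"

before=$(git ls-files -s)                # staging area: mode, blob hash, stage, path

printf 'one, edited\n' > a.txt           # edit an existing file
printf 'four\n'        > d.txt           # create a totally new file
git add a.txt d.txt                      # b.txt and c.txt are never mentioned

after=$(git ls-files -s)
count_after=$(git ls-files | wc -l | tr -d ' ')

echo "$before"
echo "---"
echo "$after"
```

The entries for b.txt and c.txt should be identical in both listings, the hash for a.txt should change, and d.txt should appear as a fourth entry. A git commit at this point would snapshot all four.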
A concrete example
Suppose you have an existing repository where the main branch (whatever its name is: GitHub now encourages people to use main, while older repositories tend to use master) has ten files in its most recent commit. You git clone this repository to your laptop, so your laptop Git software ("your Git") checks out this last commit, extracting the ten files into Git's staging area and your working tree.
You now change five of the ten files in your working tree, but run git add on only one of the five updated files. This means that your Git's staging area has ten files in it: nine files match the ones from the current commit and one matches the updated file in your working tree. Four staging-area files differ from their working-tree counterparts; the remaining six staging-area files match their working-tree counterparts.
If you now run git commit -m haaaaaands, you get a new commit containing the ten files exactly as they appear in the staging area right now. You still have all the updated working-tree files in your working tree, but the staging-area copies of the four files you did not add still match the previous commit's copies. So the new commit's copies match the older commit's copies, except for the one file on which you ran git add.
The new commit you just made becomes the current commit, which is now the most recent commit in your laptop's repository on the current branch. You can now use git push to send this commit to the GitHub repository; if and when you eventually do that, the commit they receive will match, bit-for-bit, the commit your Git stored in your laptop repository. It will have the 9-files-that-match-one-file-that-doesn't situation; the commit they get will have the previous commit as its parent; and so on.
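The ten-file scenario can be reproduced locally. The following is a sketch, not a transcript of anyone's real repository: it builds the "cloned" commit by hand in a scratch directory instead of running git clone, and the file names (file1.txt through file10.txt) and the identity are invented for the demo:

```shell
set -eu
repo=$(mktemp -d)                        # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"           # placeholder identity
git config user.email "example@example.invalid"

# Stand-in for the tip commit you would get from a clone: ten files.
for i in 1 2 3 4 5 6 7 8 9 10; do
    printf 'original %s\n' "$i" > "file$i.txt"
done
git add .
git commit -q -m "ten files"

# Change five files in the working tree, but stage only one of them.
for i in 1 2 3 4 5; do
    printf 'changed %s\n' "$i" > "file$i.txt"
done
git add file1.txt
git commit -q -m haaaaaands

in_commit=$(git ls-tree -r --name-only HEAD | wc -l | tr -d ' ')
changed_in_commit=$(git diff --name-only HEAD~1 HEAD)
still_unstaged=$(git diff --name-only | wc -l | tr -d ' ')

echo "files in the new commit:        $in_commit"
echo "files differing from parent:    $changed_in_commit"
echo "changed files left unstaged:    $still_unstaged"
```

The new commit should hold all ten files, differ from its parent only in file1.txt, and leave four modified files in the working tree that were never staged.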
Things to know about git status
First, git status tells you things about your current branch. It will say something like On branch main. This is your Git telling you that your laptop repository has main as the current branch. Your Git may also tell you that you are "ahead" and/or "behind" some other name, such as origin/main: this uses information stored entirely locally, on your laptop. This information may be out of date, depending on how active the other Git repository, over on GitHub or wherever it is, may be.
Next, if you're not in the middle of a conflicted merge (if you are, the rest gets more complicated), the git status command runs two comparisons:
First, it compares the files in the current commit to the files in the staging area. Some of these files will usually match exactly, since you didn't do anything with them since the time they were extracted from some commit. For those files, your Git says nothing at all.
Other files in the staging area won't match your current commit, for instance because you ran git add on them. In this case, your Git will say that these files are staged for commit. That simply means that the staging area copy differs from the current commit's copy in some way.
Note that some files in the staging area may be new. That is, those files do not exist at all in the current commit. For these files, Git will say that these are "new files".
Having listed files "staged for commit", or not found any files to list, your Git now goes on to compare the files in the staging area to the files in your working tree. As before, some files may match. Other files might be different—and there might even be files in your working tree that have no counterpart at all in the staging area: files that are new, as before.
This time, though, your Git will only tell you about changed files, saying that such files are not staged for commit. It does collect up a list of each of the new files as well, but holds off on them until the next part.
Having listed any files "not staged for commit", your Git goes on to tell you about untracked files. These are any files in your working tree that aren't in Git's staging area. In other words, these are "new" files.
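Each of the two comparisons, plus the untracked list, can be run on its own. Here is a small sketch, assuming a Unix-style shell and a scratch repository; the file names are invented so that each one lands in exactly one category:

```shell
set -eu
repo=$(mktemp -d)                        # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"           # placeholder identity
git config user.email "example@example.invalid"

printf 'a\n' > staged.txt
printf 'b\n' > unstaged.txt
git add .
git commit -q -m "base"

printf 'a, edited\n' > staged.txt
git add staged.txt                       # staging area now differs from the commit
printf 'b, edited\n' > unstaged.txt      # working tree differs from the staging area
printf 'c\n' > added.txt
git add added.txt                        # new file: in the staging area, not in the commit
printf 'd\n' > untracked.txt             # not in the staging area at all

staged=$(git diff --cached --name-only)                # commit vs. staging area
not_staged=$(git diff --name-only)                     # staging area vs. working tree
untracked=$(git ls-files --others --exclude-standard)  # working tree only

git status --short
```

In the short status format, a letter in the first column (A or M here) marks a file staged for commit, a letter in the second column marks one not staged for commit, and ?? marks an untracked file. The three variables should hold added.txt and staged.txt, then unstaged.txt, then untracked.txt.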
What's odd about untracked files is that they are separated out as their own category. The reason for this is that the Git authors expect a very large number of untracked files that should not be reported here. Git in particular is built to work with compilers that create "object files" and other "build artifacts" that, while they may be important, should not be added to commits and thus saved forever.[1]
To this end, Git has an exclusion facility, via .gitignore and other exclusion files. Here, you list files that Git should just shut the ____ up about. It should not complain that these untracked files are untracked. Moreover, when these files are untracked, you can use an en-masse git add operation, such as git add ., to add all untracked files ... except for those marked "ignore".
What's misleading about .gitignore is that it will not ignore any file that is tracked. The word tracked here is defined as the opposite of untracked. An untracked file is a file that exists in your working tree, but not in Git's index (another name for the staging area). A tracked file is one that is in Git's index, whether or not it exists in your working tree. A tracked file is never ignored.
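You can check the "tracked files are never ignored" rule with a quick experiment. This is a sketch with invented names (tracked.log, scratch.log) in a scratch repository; the *.log pattern is just an example of something you might put in .gitignore:

```shell
set -eu
repo=$(mktemp -d)                        # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"           # placeholder identity
git config user.email "example@example.invalid"

printf 'first\n' > tracked.log
git add tracked.log
git commit -q -m "track a log file"      # tracked.log is now in the index

printf '*.log\n' > .gitignore            # ask Git to ignore every .log file
printf 'second\n' >> tracked.log         # tracked, so the pattern has no effect on it
printf 'noise\n' > scratch.log           # untracked, so the pattern applies

modified=$(git diff --name-only)                           # changed tracked files
untracked_shown=$(git ls-files --others --exclude-standard) # untracked, not ignored

git status --short
```

Here git status should still report tracked.log as modified, even though it matches *.log. It should list .gitignore as untracked and say nothing at all about scratch.log.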
Good maintenance of .gitignore files makes Git much more pleasant to use: git status tells you only useful things; git add . adds only the correct things.
[1] The reason for this is that the build artifacts are, at least ideally, completely reproducible from the original sources. We want to save only the originals, not the derived work-products. That saves, at least potentially, enormous amounts of space and time and human work later. Note that there is a lot of "ideal" and "potential" here. These things don't always work out as planned, and sometimes it's actually reasonable to save everything ever. Git isn't so great at that, though, so you probably don't want to use Git for that purpose.
Possible sources for "all files always committed"
If you run git add ., you are telling Git: scan my current working directory, find all updated files and all new files and any removed files, and use git add on each one to update your staging area copies. The only exceptions here are files that are listed in .gitignore or other exclusion files and are not already tracked.
If you run git add *, the behavior depends somewhat on your command line interpreter: Unix-style CLIs (such as bash or zsh) have the shell expand the *, while MS-DOS style CLIs (such as CMD.EXE) pass the literal asterisk * to Git, which then expands the *. I won't go into all the details of the difference here, but this tends to do an en-masse add of many, or all, files, depending on the details.
If you run git add -u, you tell Git to find updated files that are already tracked and add them. It does not pick up new, untracked files.
You can have a pre-commit hook. Hooks in Git are rather complicated, but some software installers will not only install Git for you, but also set up some sort of automatic hook creation. (This is the kind of setup where the reinitialization of a Git repository can have an effect, although for it to do so, the installer has to put those hooks into a Git "template", which seems to be used rarely if ever.) A pre-commit hook can, depending on how you run git commit, run git add for you, even if you don't want it to.
If you run git commit -a, you are in effect telling Git to run:
git add -u
git commit
There's an interaction here with pre-commit hooks, so the two-command sequence is not exactly the same, but this could be the source of your problem.
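To check whether git commit -a is what is sweeping your changes into commits, you can try it in a scratch repository. This sketch assumes no hooks are installed there, and the file names and identity are placeholders:

```shell
set -eu
repo=$(mktemp -d)                        # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"           # placeholder identity
git config user.email "example@example.invalid"

printf 'v1\n' > tracked.txt
git add tracked.txt
git commit -q -m "base"

printf 'v2, longer\n' > tracked.txt      # modify a tracked file, with no git add
printf 'new\n' > brand-new.txt           # create a file Git has never seen

git commit -q -a -m "commit with -a"     # stages updated tracked files, then commits

committed=$(git diff --name-only HEAD~1 HEAD)
left_untracked=$(git ls-files --others --exclude-standard)

echo "changed in the new commit: $committed"
echo "still untracked:           $left_untracked"
```

The modified tracked.txt should end up in the commit even though you never ran git add on it, while brand-new.txt stays untracked. That matches the add -u behavior: updated tracked files are picked up, new files are not.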