This is a little tricky until you realize that Git isn't actually about files. Git is about commits.
Commits store files. Inside each commit, Git stores a full and complete copy of all of your files. That's not all of a commit—the files are its data, and a commit holds metadata as well, such as information about who made the commit (name and email address), when (date-and-time-stamp), and why (the log message they wrote when they made the commit). There's even more to it than that, but we won't get into that in this answer.
The thing to remember is this: Git stores commits, and each one of those commits stores every file. Once you—or anyone—make a commit, nothing about that commit can ever be changed. So as long as you still have any one particular commit you made, you still have all of the files, as of the way they were when you made that one commit. All commits are always, 100%, totally read-only. Most commits remain in your repository forever.
(It's a bit hard, but definitely not impossible, to get rid of a commit. Don't worry too much about "bad" commits taking up a lot of space. The frozen-for-all-time snapshots of every file are compressed—often highly compressed—and thus usually take far less space than the original files, and, since they're frozen forever, Git can and does re-use them automatically. That is, suppose you make 100 commits in a row, changing only one file each time. This can be the same file each time, or a different one each time, and you can change it back and forth if you like—it doesn't really matter. Let's say there are 500 files in each of these 100 commits. Since only one file is different in your next commit, it re-uses 499 files from the previous commit! And, if you change a file back, the new file is the same as the old copy in an old commit, and Git automatically just re-uses the old copy.)
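To make the re-use idea concrete, here is a toy model of it. This is only a sketch under simplifying assumptions: the names store_blob and make_commit are made up for illustration, a "commit" is just a mapping from file name to content hash, and real Git additionally compresses objects and organizes them differently on disk.

```python
import hashlib

def store_blob(store, content):
    """Save content under its hash; identical content is stored only once."""
    key = hashlib.sha1(content).hexdigest()
    store.setdefault(key, content)
    return key

def make_commit(store, files):
    """A toy snapshot: every file name mapped to the hash of its content."""
    return {name: store_blob(store, data) for name, data in files.items()}

store = {}
files = {f"file{i}.txt": f"contents of file {i}\n".encode() for i in range(500)}
original = files["file0.txt"]
commits = [make_commit(store, files)]            # the starting snapshot

for n in range(1, 101):                          # 100 commits, one file changed each
    files["file0.txt"] = f"revision {n}\n".encode()
    commits.append(make_commit(store, files))

blobs_before_revert = len(store)
files["file0.txt"] = original                    # change the file back
commits.append(make_commit(store, files))

print(len(commits), "snapshots of 500 files each")    # 102 snapshots of 500 files each
print(len(store), "distinct file contents stored")    # 600 distinct file contents stored
print(len(store) == blobs_before_revert)              # True: the revert stored nothing new
```

Every snapshot still lists all 500 files, but the store holds 600 contents rather than 102 × 500 = 51,000, because unchanged files share one stored copy.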
Each commit has a big ugly hash ID that is unique to that one particular commit. The magic of this hash ID is that it is not only unique to that commit, but also, every Git in the universe will agree that that commit gets that hash ID. No other commit anywhere gets that ID.[1] So it's easy for two Gits to tell whether they have the same commits or not: they just look at each other's hash IDs. (Actually getting the commit requires getting any files they have that you don't, too, but it's quick to check whether you already have it.)
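The reason every Git agrees on an ID is that the ID is computed from the content itself. Here is a small sketch for the simplest kind of object, a file ("blob"), assuming the default SHA-1 object format; the function name git_blob_id is my own. Commits are hashed in the same manner, over their own content, which includes the metadata.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Hash a file's content the way Git does for a blob (SHA-1 repositories)."""
    # Git prefixes the data with the object type, its size in bytes, and a NUL byte.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Same bytes in, same ID out, on any machine.
print(git_blob_id(b""))                 # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b"hello world\n") == git_blob_id(b"hello world\n"))   # True
print(git_blob_id(b"hello world\n") == git_blob_id(b"hello world!\n"))  # False
```

If you have Git installed, running git hash-object on a file should print the same ID this function computes for that file's bytes.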
With that in mind, here's the answer:
First, we should note this: git pull means run git fetch, then (if that works) run a second Git command, typically git merge.
The git fetch command has your Git call up someone else's Git, and ask them for any commits they have, that you don't have, that your Git should get. They give you those commits, and now you have your commits plus their commits. This does nothing to any of your commits, so it's always completely safe.
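As a rough mental model (not how the real transfer protocol works), you can picture each repository as a table keyed by hash ID, with fetch copying over only the entries you lack. The function name fetch and the short IDs below are made up for illustration.

```python
def fetch(mine, theirs):
    """Copy commits they have that we lack; never touch the ones we already have."""
    missing = theirs.keys() - mine.keys()
    for commit_id in missing:
        mine[commit_id] = theirs[commit_id]
    return missing

# Hypothetical repositories: hash ID -> commit (shortened IDs for readability).
mine = {"a1": "initial commit", "b2": "my work"}
theirs = {"a1": "initial commit", "c3": "their work"}

before = dict(mine)
new_ids = fetch(mine, theirs)

print(sorted(new_ids))                                  # ['c3']
print(sorted(mine))                                     # ['a1', 'b2', 'c3']
print(all(mine[k] == v for k, v in before.items()))     # True: my commits are untouched
```

Because existing entries are only ever added to, never rewritten, nothing you had before the fetch can be lost by it.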
The git merge command is not quite as safe, but it's pretty safe! It first checks that you have no uncommitted work that the merge would overwrite. If you do, it complains and makes you commit it ... and once you have, all of your files, as of the state they have when you commit them, are safely frozen for all time in your new commit.
The git merge operation works by finding changes. You and your friend / co-worker started with some common commit: some commit with the same hash ID, that you and they got by sharing, earlier. Then you made some changes to some files and made a new snapshot that saved your changes (in the form of the new whole files) forever, and they also made some changes and made a new snapshot that saved their changes forever. Merge compares the merge base commit, the one you both started with, to your own latest commit, and separately, to their latest commit.
Comparing any two commits tells you what is different in those two commits. So now Git knows what you changed, and also what they changed. The merge operation combines these changes, and applies the combined changes to the snapshot from the base version.
If you and they changed the same lines of the same files, you will get what Git calls a merge conflict. In this case, Git leaves you with a mess, and you must clean it up manually, perhaps with the help of a merge tool. But if Git thinks it is able to combine your changes on its own, Git will go ahead and commit the resulting combined-changes snapshot, as a merge commit. This merge commit remembers the hash IDs of both your own immediately-preceding commit and their last commit, so that Git knows which two commits went into performing the merge, and can show you both "legs" of the history that result from this merge.
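The compare-twice-then-combine idea can be sketched in a few lines. This is a deliberately simplified model: it works on whole files rather than on lines within files as Git does, snapshots are plain dictionaries, and the names changes and merge are my own.

```python
def changes(base, tip):
    """Files whose content differs between base and tip (None means deleted)."""
    names = base.keys() | tip.keys()
    return {n: tip.get(n) for n in names if base.get(n) != tip.get(n)}

def merge(base, ours, theirs):
    """Combine both sets of changes and apply them to the base snapshot."""
    our_changes = changes(base, ours)
    their_changes = changes(base, theirs)
    result = dict(base)
    conflicts = []
    for name in sorted(our_changes.keys() | their_changes.keys()):
        both = name in our_changes and name in their_changes
        if both and our_changes[name] != their_changes[name]:
            conflicts.append(name)        # changed differently on both sides
            continue                      # leave the base version in place
        new = our_changes[name] if name in our_changes else their_changes[name]
        if new is None:
            result.pop(name, None)        # the file was deleted
        else:
            result[name] = new
    return result, conflicts

base   = {"README": "v1", "main.py": "v1", "util.py": "v1"}
ours   = {"README": "v1", "main.py": "ours", "util.py": "v1"}
theirs = {"README": "theirs", "main.py": "v1", "util.py": "v1"}

merged, conflicts = merge(base, ours, theirs)
print(merged)      # {'README': 'theirs', 'main.py': 'ours', 'util.py': 'v1'}
print(conflicts)   # []

clashing = {"README": "v1", "main.py": "theirs", "util.py": "v1"}
print(merge(base, ours, clashing)[1])   # ['main.py']
```

In the first case the two sides touched different files, so the result takes both changes. In the second, both sides changed main.py in different ways, which is the situation Git reports as a conflict and hands back to you.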
Some people might advise you to use git stash before git pull, rather than making your own commit. You can do this, but I don't recommend it. The reason is simple enough: all git stash push[2] does is make some commits that are not on any branch. This makes the git pull easier, because now you don't have a commit to merge when git pull does its fetch-and-merge step (it merges using the commits on the branch), but afterward you'll need to use git stash apply or git stash pop to undo the stash.
You might forget to do this right away, and start by making some changes that you have not committed, then remember the git stash apply / git stash pop step. This operation uses the commits that git stash push made to run a much less safe version of git merge. If things go badly here, it can be very hard to recover the stuff you changed but didn't commit. So by using git stash, you're just putting off the merge that you'll need to do anyway.
Once you know lots about merging and other Git operations, you can use git stash push, git stash apply, and git stash drop more safely, but while you are a Git newbie, I recommend avoiding it.
[1] Technically, two different Git repositories can re-use a hash ID as long as you never introduce those two Git repositories to each other. For more about this, see "How does the newly found SHA-1 collision affect Git?"
[2] What is now spelled git stash push was spelled git stash save in older versions of Git. The new spelling, push, fixes a bunch of minor issues with options and allows some new features, and also helps explain why pop is sort of the counterpart of push. The problem I have with git stash pop (well, aside from my overall objections to using git stash very much in the first place) is that Git will drop the stash if Git thinks the apply step went well. Sometimes the apply step goes badly even though Git thinks it was fine, or you've applied the wrong stash, and it would be nice to not have dropped the stash.
(If you make a lot of stashes, you'll find that sometimes they are very hard to tell apart. After a while you don't know which ones you are keeping for what reason. Admittedly commits can have the same problem, but at least here, you have more of a chance of figuring things out again later.)