The phrase "preserve commit history" is nonsense in Git.
The reason is that commits are history; history is nothing but commits. You either have the commits, so everything is preserved, or you don't, so it's not.
What people usually mean by this is: I renamed some file, and now I can't find it. That's not surprising, because each Git commit is just a snapshot of all files. Commit `A` has files `README.txt` and `starter.py`, and commit `Z` at the end has `README.rst` and `alldone.py`. If `README.rst` was a rename (and perhaps a modification as well) somewhere along the way, the only way to find that is for Git to walk, one commit at a time, from `Z` back to `Y` back to `X` until, at some point (say between `M` and `N`), comparing the contents of the two commits shows that, wha-hey, `README.txt` in `M` is an awful lot like `README.rst` in `N`, so let's call that a rename and stop looking for `README.rst` and start looking for `README.txt` instead. That's what `git log --follow` does.
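To make the walk concrete, here is a rough Python sketch of the idea. It is not Git's code: the `history` list, the `follow` function, and the use of `difflib` as a similarity measure are all assumptions made for illustration, and unlike the real `git log --follow` it lists every commit rather than only those that touched the file.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Line-based similarity in [0, 1]; a stand-in, not Git's own similarity index."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def follow(history, path, threshold=0.5):
    """Trace `path` backwards through `history`, a list of
    (commit_name, {filename: content}) pairs ordered oldest first.
    Returns (commit_name, filename) pairs, newest first."""
    trail = []
    for i in range(len(history) - 1, -1, -1):
        commit, files = history[i]
        trail.append((commit, path))
        if i == 0:
            break  # no parent left to compare against
        parent = history[i - 1][1]
        if path in parent:
            continue  # same name in the parent, nothing to detect
        # The name vanished: look for a file only the parent has that is similar enough.
        candidates = [(similarity(parent[old], files[path]), old)
                      for old in parent if old not in files]
        if not candidates or max(candidates)[0] < threshold:
            break  # treat the file as created in this commit
        path = max(candidates)[1]
    return trail


# Hypothetical four-commit history; the rename happens between M and N.
readme = "Intro\nInstall\nUsage\nLicense\n"
history = [
    ("A", {"README.txt": readme, "starter.py": "print('start')\n"}),
    ("M", {"README.txt": readme, "starter.py": "print('start')\n"}),
    ("N", {"README.rst": readme + "Credits\n", "starter.py": "print('start')\n"}),
    ("Z", {"README.rst": "Overview\nSetup\nCredits\n", "alldone.py": "print('done')\n"}),
]

for commit, name in follow(history, "README.rst"):
    print(commit, name)
# Z README.rst
# N README.rst
# M README.txt
# A README.txt
```

The rename is found only because `M` and `N` are compared with each other: at that step the two versions of the file still share most of their lines.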
If you jump straight from `Z` to `A`, the contents of the two files may differ too much to match them up. But that's OK as far as Git is concerned: if you ask "how do I edit the files in `A` to make them look like the files in `Z`?", Git will say "remove `README.txt` and create a new `README.rst` with these contents", and those instructions work. They don't tell you what you wanted to know, but they're good enough, as far as Git is concerned.
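The difference between comparing neighbouring commits and comparing the two endpoints can be sketched in a few lines of Python. Everything here is illustrative: the `compare` function, the sample snapshots, and the `difflib`-based similarity score are assumptions, not what Git actually computes.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Line-based similarity in [0, 1]; a stand-in, not Git's own similarity index."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def compare(old, new, threshold=0.5):
    """Return instructions that turn snapshot `old` into snapshot `new`.
    Each snapshot is a {filename: content} dict."""
    removed = sorted(name for name in old if name not in new)
    added = sorted(name for name in new if name not in old)
    ops = []
    for src in list(removed):
        # Pick the most similar newly appearing file, if there is one.
        best = max(added, key=lambda dst: similarity(old[src], new[dst]), default=None)
        if best is not None and similarity(old[src], new[best]) >= threshold:
            ops.append(("rename", src, best))
            removed.remove(src)
            added.remove(best)
    ops += [("delete", name) for name in removed]
    ops += [("create", name) for name in added]
    return ops


# Hypothetical snapshots: M and N are neighbours, A and Z are the two endpoints.
A = {"README.txt": "Intro\nInstall\nUsage\nLicense\n"}
M = dict(A)
N = {"README.rst": "Intro\nInstall\nUsage\nLicense\nCredits\n"}
Z = {"README.rst": "Overview\n========\nSetup steps\nExamples\nCredits\n"}

print(compare(M, N))  # [('rename', 'README.txt', 'README.rst')]
print(compare(A, Z))  # [('delete', 'README.txt'), ('create', 'README.rst')]
```

Both answers are valid recipes for getting from one snapshot to the other; only the first one happens to say what a human would call the file's history.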
When you move functions from one file to another, some parts of Git, including `git blame`, can do commit-by-commit comparisons, search all the files in the earlier commit, and find where the moved code came from (for `git blame` you need the `-C` option, versus the `--follow` one for `git log`). Other parts of Git, including diffing an early commit directly against a late commit, often can't: when comparing any pair of commits, the renamed file must be sufficiently similar to the original file for the `-M` / `--find-renames` option to work. You can adjust the rename-finding threshold: `-M`, when turned on without any threshold, uses a 50% similarity index, i.e., about half of the file must be the same in the two commits for Git to call that a rename operation. But this rename detection requires several other conditions to hold as well. Typically, moving a function from one existing file to another existing file will cause it to fail. For `git diff` you can sometimes use `-B` as well (the break-pairings flag, which takes up to two similarity index values), but its usefulness diminishes pretty rapidly.
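A small Python sketch may help show why the threshold matters and why a function moved between two existing files slips through. It assumes, as a simplification, that only paths present in one commit and absent from the other are rename candidates. The `find_renames` function, the file names, and the `difflib`-based percentage are hypothetical stand-ins for Git's own machinery.

```python
from difflib import SequenceMatcher


def similarity_percent(a: str, b: str) -> int:
    """Line-based similarity as a percentage; a stand-in for Git's similarity index."""
    return round(100 * SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio())


def find_renames(old, new, threshold=50):
    """Pair files that disappeared from `old` with files that appeared in `new`.
    Files present on both sides are never candidates in this sketch."""
    sources = [name for name in old if name not in new]
    targets = [name for name in new if name not in old]
    found = []
    for src in sources:
        for dst in targets:
            pct = similarity_percent(old[src], new[dst])
            if pct >= threshold:
                found.append((src, dst, pct))
    return found


# Case 1: a renamed and edited file that keeps 3 of its 4 original lines.
old = {"notes.txt": "a\nb\nc\nd\n"}
new = {"notes.md": "a\nb\nc\nx\ny\nz\n"}
print(find_renames(old, new))                # [('notes.txt', 'notes.md', 60)]
print(find_renames(old, new, threshold=90))  # []

# Case 2: g() moves from util.py to helpers.py, and both files exist on both sides.
before = {
    "util.py": "def f():\n    return 1\n\ndef g():\n    return 2\n",
    "helpers.py": "def h():\n    return 3\n",
}
after = {
    "util.py": "def f():\n    return 1\n",
    "helpers.py": "def h():\n    return 3\n\ndef g():\n    return 2\n",
}
print(find_renames(before, after))  # []
```

In the second case no threshold helps, because no file was added or removed, so there is nothing to pair up. The raised threshold in the first case plays the role of something like `-M90%` on the command line.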