Changes that are obviously separate to a human (who can understand code) are regularly messed up by Git's diff algorithm. For example:
    def method_that_already_existed(blah)
      a line that did not change
      a line that was deleted ######## the changed area starts here (per Git)
      a new line
    end

    def a_newly_added_method_that_belongs_in_its_own_commit
      blah blah blah
      blah blah blah
      etc. ######## the changed area ends here (per Git)
    end
It is obvious to a human that the change to the first method and the entirely new method are two separate changes. But Git treats them as one, and DOES NOT ALLOW ME TO SPLIT THEM UNDER ANY CIRCUMSTANCES.
Worse than that, the change (according to Git) runs from the middle of the first method to just before the end of the second method. This makes it impossible to select just specific lines and commit one of the methods. The lines that Git sees as "context" are impossible to select.
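For reference, the situation can be reproduced with a small shell script. This is only a sketch under assumptions: the file name `example.rb`, the method bodies, and the throwaway repository are made up for illustration, and it assumes `git` is on the PATH. It only counts hunks. Exactly which lines Git pairs up inside the hunk may depend on the Git version and diff settings.

```shell
#!/bin/sh
# Sketch: reproduce one hunk that covers two logically separate changes.
# Assumptions: git is installed; example.rb and its contents are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Original version: a single existing method.
cat > example.rb <<'EOF'
def method_that_already_existed(blah)
  a_line_that_did_not_change
  a_line_that_was_deleted
end
EOF
git add example.rb
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Original version"

# New version: one line replaced in the old method, plus a brand-new method.
cat > example.rb <<'EOF'
def method_that_already_existed(blah)
  a_line_that_did_not_change
  a_new_line
end

def a_newly_added_method
  blah
  blah
  etc
end
EOF

# Count the hunk headers (lines starting with "@@") in the diff.
hunks=$(git diff -- example.rb | grep -c '^@@')
echo "hunks: $hunks"
```

With default diff settings, both edits sit within a few lines of each other, so they should land in the same hunk and the script should report a single hunk.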
If I use `git add -p ./path/to/file`, it no longer has the `s` option for split in my version of Git (which never worked well anyway). It does have `e` for edit, but that will not allow adding the final `end` of the second method. So basically Git offers me absolutely no way of selecting the changes intelligently and adding them in separate commits.
Likewise in VS Code, I can select line-by-line from the existing lines, but I can't select lines that Git doesn't think of as part of the changed area. (I also can't differentiate between added lines and removed lines: a change includes the deleted lines invisibly, so if they actually belong to a different change, I'm out of luck again.)
So there's no way to control this that I can find, unless I change my code just to trick Git into doing the right thing. If I dig into the history to get the line that was deleted in the first method, add it back in, (temporarily) remove the line that was added, and save the file, then Git properly recognizes what has changed. Of course I then have to remember to undo this kludgy solution, and undo it properly, or I've broken my code. It is a tedious and really horrible workaround.
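Spelled out as a script, the workaround looks roughly like the sketch below. The repository, the file name `example.rb`, the method bodies, and the commit messages are all hypothetical. In real use the intermediate version would be produced by hand-editing the file rather than by a heredoc. It assumes `git` is on the PATH.

```shell
#!/bin/sh
# Sketch of the manual workaround: commit an intermediate version of the file,
# then restore the finished version and commit the rest.
# Assumptions: git is installed; example.rb and its contents are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

cat > example.rb <<'EOF'
def method_that_already_existed(blah)
  a_line_that_did_not_change
  a_line_that_was_deleted
end
EOF
git add example.rb
commit "Original version"

# The working tree now holds BOTH changes (edited method + new method).
cat > example.rb <<'EOF'
def method_that_already_existed(blah)
  a_line_that_did_not_change
  a_new_line
end

def a_newly_added_method
  blah
  blah
  etc
end
EOF

# 1. Keep a copy of the finished file outside the repository.
backup=$(mktemp)
cp example.rb "$backup"

# 2. Temporarily undo the edit to the first method (done by hand in real life),
#    so that only the new method differs from the last commit.
cat > example.rb <<'EOF'
def method_that_already_existed(blah)
  a_line_that_did_not_change
  a_line_that_was_deleted
end

def a_newly_added_method
  blah
  blah
  etc
end
EOF
git add example.rb
commit "Add a_newly_added_method"

# 3. Restore the finished file and commit the remaining change.
cp "$backup" example.rb
rm "$backup"
git add example.rb
commit "Update method_that_already_existed"

git log --oneline
```

Keeping the backup copy outside the repository at least removes the risk of undoing the temporary edit incorrectly, but the intermediate version still has to be built by hand, which is the tedious part.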
I would love it if there were a way to get Git to recognize changes "properly" the way a human would. I'm not expecting that any time soon, at least not until we have AST-based diff algorithms. So the next best thing would be to have a way to specify what has changed and not leave it up to Git to guess. Is there any way to do that?
For example (this would be just one way to partially solve the problem), if I could tell Git to NEVER EVER EVER EVER let a diff chunk span an empty line, I would solve this particular example. If I have a chunk that I want to span an empty line, I'm happy to add both chunks separately. Git should always treat them as distinct changes. But that's just one example, and not the basic question.
The basic question is:
If Git can't properly recognize what has changed, how can I force it to accept my version of what has changed? (Without resorting to tedious & error-prone kludges like manually undoing some changes by digging into git history to undo one of the changes so it won't erroneously group two separate things together!)