
We are using git for source control.

If I take a look at particular file history (using git log --pretty=format:"%h %ad" --date=short <filename>), I see something like this:

HashOfCommitA 2018-01-15
HashOfCommitB 2018-01-09
HashOfCommitC 2018-01-05
<older commits>

The CommitA and CommitB are merge commits.

In the changes of CommitB (git diff HashOfCommitB <filename>) there are two new lines added to the file. The changes of CommitA don't touch those lines, but if I examine the file content after the CommitA merge, the two lines added in CommitB are missing.

Basically, when I look at the file history, I can see that something is added at one point, but after the next commit it is missing, and I don't see the deletion of these lines in any commit's changes.

Might it be because the merge was made with an older version of a branch (without CommitB)? How can I find where those lines were deleted?

In other words, how is this possible? Are there good ways to prevent this situation in the future?

Roganik
  • 1
    Git can sometimes automatically complete a merge and in the process remove something. I don't see this happen often, but it can happen, and there is no extra commit recording this; it is just part of the merge commit. General advice would be to not have your developers closely overlapping on areas of the same code. Following this advice would also minimize ugly merge conflicts. – Tim Biegeleisen Jan 17 '18 at 06:06

1 Answer

Related: Git; code disappeared after merge. Pay particular attention to badly-behaved tools that make it very easy to keep only one "side" of a merge conflict without thinking. As a very general rule, a good, reliable way to avoid the problem is to have thorough tests that you run on each commit before accepting it.


If I take a look at particular file history ...

Be careful here. Git doesn't have file history; what you are seeing is a faked-up pseudo-history resulting from Git selecting particular commits—the commits are the history, and the complete set of commits is the only history actually inside the repository—that the authors of git log thought would be a good filtered-down selection. When you use this kind of filtering, git log enables, by default, what it calls History Simplification, which is described (rather poorly in my opinion) in this section of the documentation. I find that this can be quite misleading.

Besides that, unless you use --graph, the output from git log is sorted and displayed in a way that makes it difficult or sometimes impossible to tell which commits really happened at which points. There is a basic problem here with showing, in a linear order, that which is fundamentally not linear:

       B--C
      /    \
...--A      F--G   <-- branch-tip
      \    /
       D--E

Here, commit G clearly comes last, so that's the one Git will show you first. Commit F comes next (i.e., just before G) so Git shows you F next. But now Git could show you either C or E. Which one should it pick?

Git's default is to take them by commit time, so if C happened slightly after E, Git will show you C here. Then E probably comes next in Git's backwards order (i.e., it is the next-most-recent commit), so Git now shows E; after that, Git can show B or D. Once it has shown one of those, it can show the other one or A, but the other one (of B or D) is probably next, unless the commits were made with the time set wrong, or on another computer with a different idea of the correct time.
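To see the difference --graph makes, here is a minimal sketch that builds a tiny two-armed history in a throwaway repository and prints it with the graph drawn. The file names, the branch name side, and the one-letter commit messages are all made up for illustration and only loosely mirror the drawing above.

```shell
set -eu

# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

printf 'line 1\n' > file.txt
git add file.txt
git commit -q -m "A"

# One arm: a commit on a side branch.
git checkout -q -b side
printf 'line 2\n' >> file.txt
git commit -q -am "B"

# The other arm: a commit on the original branch.
git checkout -q "$main"
printf 'other\n' > other.txt
git add other.txt
git commit -q -m "D"

# Join the two arms with an ordinary merge.
git merge -q -m "F" side

# --graph draws the parent/child links next to each commit subject.
graph=$(git log --graph --format=%s)
printf '%s\n' "$graph"
```

Without --graph the same four subjects come out as a flat list, and nothing in that list tells you that B and D sit on different arms.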

Eventually, you do see all of the commits—unless, that is, History Simplification has been removing some, perhaps entire arms of branches. (Even without History Simplification, the order they are displayed in is sometimes hard to predict.) Worse, in the default History Simplification mode, as the (perhaps impenetrable) documentation linked above mentions:

If the commit was a merge, and it was TREESAME to one parent, follow only that parent. (Even if there are several TREESAME parents, follow only one of them.) Otherwise, follow all parents.

In your case, you are getting the default mode, so if the author of the merge picked one "side" of a branch (deliberately not taking changes from the other side), Git prunes the other side entirely. But that's precisely where the changes that you think should have been kept were dropped, so git log has in effect lied to you!
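Here is a minimal sketch of that pruning in a throwaway repository. It assumes the merge was made with -s ours, which is only one of the ways to keep a single side; the file names, branch name, and commit messages are invented for the demonstration. The --full-history option is the documented way to switch this pruning off.

```shell
set -eu

# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
main=$(git symbolic-ref --short HEAD)

printf 'line 1\n' > file.txt
git add file.txt
git commit -q -m "base"

# The side branch adds two lines to the file.
git checkout -q -b topic
printf 'new line A\nnew line B\n' >> file.txt
git commit -q -am "add two lines"

# The main branch does unrelated work, then merges topic while keeping
# only its own side (assumption: -s ours stands in for "picked one side").
git checkout -q "$main"
printf 'other\n' > other.txt
git add other.txt
git commit -q -m "unrelated work"
git merge -q -s ours -m "merge topic (ours)" topic

# Default History Simplification: the merge is TREESAME to its first
# parent for file.txt, so the topic arm is never followed.
simplified=$(git log --format=%s -- file.txt)

# --full-history follows every parent of every merge.
full=$(git log --format=%s --full-history -- file.txt)

printf 'default:\n%s\n\nfull history:\n%s\n' "$simplified" "$full"
```

In this setup the default listing never mentions the commit that added the two lines, while the --full-history listing does, along with the merge that discarded them.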


I will add this as well, although this may only be slightly relevant:

The CommitA and CommitB are merge commits.

(This means that these are commits with at least two parents, probably with exactly two parents.)

In the changes of CommitB (git diff HashOfCommitB <filename>) ...

If you're running that command in exactly that way, you are asking Git to compare what's stored in the merge (the merge result) to what's in the current work-tree. That's the fourth form of git diff in the description section of the git diff documentation. So these are not changes stored in the merge commit.
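A small sketch of the distinction, in a throwaway repository with made-up file contents: the one-commit form of git diff reports whatever the work-tree currently holds, while the two-commit form compares the commit to its parent.

```shell
set -eu

# Throwaway repository; all names and contents below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'one\n' > file.txt
git add file.txt
git commit -q -m "first"

printf 'two\n' >> file.txt
git commit -q -am "second"
commit=$(git rev-parse HEAD)

# An uncommitted edit that exists only in the work-tree.
printf 'scratch\n' >> file.txt

# One commit named: that commit compared to the current work-tree.
against_worktree=$(git diff "$commit" -- file.txt)

# Two commits named: the parent compared to the commit itself.
against_parent=$(git diff "$commit^" "$commit" -- file.txt)

printf '%s\n\n%s\n' "$against_worktree" "$against_parent"
```

The first diff shows only the uncommitted "scratch" line as added; the second shows the "two" line that the commit itself introduced.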

In fact, though, no actual changes are stored in the merge commit. The merge commit is, in this way, like any other commit: it stores a snapshot, which—in Git's "eyes" anyway—is simply the correct result of the merge (as told to Git by whoever ran git merge, with the proof of correctness being that whoever ran git merge committed the result). That merge has multiple inputs—typically, the (two) parents plus their (single) merge base—and you can compare the final merge snapshot to either of the two parents and hence find changes, but the changes you find with respect to parent #1 are different from the changes you find with respect to parent #2.
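As a sketch of comparing the merge snapshot to each parent, the script below rebuilds a one-sided merge in a throwaway repository and diffs it against parent #1 (^1) and parent #2 (^2). The -s ours merge and every file, branch, and message name are assumptions made for the demonstration; in a real repository you would substitute the hash of the suspect merge.

```shell
set -eu

# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
main=$(git symbolic-ref --short HEAD)

printf 'line 1\n' > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
printf 'new line A\nnew line B\n' >> file.txt
git commit -q -am "add two lines"

git checkout -q "$main"
printf 'other\n' > other.txt
git add other.txt
git commit -q -m "unrelated work"

# Assumption: the merge kept only the main side.
git merge -q -s ours -m "merge topic (ours)" topic
merge=$(git rev-parse HEAD)

# Snapshot of the merge compared to its first parent (the branch merged into).
vs_parent1=$(git diff "$merge^1" "$merge" -- file.txt)

# Snapshot of the merge compared to its second parent (the branch merged in).
vs_parent2=$(git diff "$merge^2" "$merge" -- file.txt)

printf 'vs parent 1:\n%s\n\nvs parent 2:\n%s\n' "$vs_parent1" "$vs_parent2"
```

Here the diff against parent #1 is empty, and the diff against parent #2 shows the two lines as removed, which is where the "missing deletion" is hiding.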

torek
  • 1
    I always see this because somebody has run `git merge -sours` because it seemed like a good idea. – Edward Thomson Jan 17 '18 at 07:59
  • Thanks for the detailed explanation. _As a very general rule, a good, reliable way to avoid the problem is to have thorough tests that you run on each commit before accepting it._ It will definitely help, except in the case where a test disappears along with the new lines it tests after such a merge. – Roganik Jan 17 '18 at 08:28
  • @Roganik: Yes, and unfortunately the tests tend to go with the code, which makes it natural and appropriate to put the tests in the same repository and the same commits. As EdwardThomson noted, a merge with `-s ours` will toss both. It seems to be more common that people toss just one, though, due to merge conflicts, so this will likely catch some, perhaps even many/most, occurrences. – torek Jan 17 '18 at 08:32