I have already read other questions (1, 2, 3) about how to make Git recognize a specific move, but they don't answer my real question: is it possible to manually control how Git interprets moves, either while committing, before the commit, or afterwards (by altering the commit's internal data)? I am open to standard answers, and even to hacking the Git repository files.

I take this seriously because letting Git know that a file has been moved, edited, replaced, etc. is very important when later reviewing the file's history with any software, since that software can then show the file's edits correctly regardless of the moves or renames the developer made. I think this is valuable information that the committer should take care to set properly: that way the commit records more than a filesystem operation; it also records the developer's logical intentions, the true meaning of the project edits.

Use cases:

Case 1:

  • move file main_configuration.txt to configurations/production/configuration.txt
  • create main_configuration.txt, add similar content to the previous but change a few lines

Git understands that you edited a few lines in main_configuration.txt and added a new file, configurations/production/configuration.txt. But I don't want to lose track of the production configuration file's edits. It isn't a new file created in this commit :(

Case 2:

  • delete file a/a.txt
  • create file b/a.txt with similar content

Git understands this as a file move, but I need the Git history to show that a/a.txt was deleted in this commit and that b/a.txt was created in this commit. This is very important information, and what Git reports instead is a serious mistake that can affect later analysis.

There are many more examples, some more contextualized, but I tried to keep these as simple as possible.

Áxel Costas Pena
  • Git does not track moves or copies at all, so if you find a solution that lets git understand that you moved the file, you will need to apply this solution every time you want this to apply. – Lasse V. Karlsen Jun 21 '18 at 10:42
  • Basically, if you're looking at a commit where you moved a file, if git is telling you that you moved or renamed that file it is entirely because the tool you're using to look at that commit "works it out" by looking at which files disappeared and which files appeared. There is absolutely no information in the git repository that says "a moved to b", none, nada, zip. Some of these tools have parameter support that instructs this "works it out" algorithm to "work harder", but that's it. If it fails, there is no way for you to store anything in the repository that contains this knowledge. – Lasse V. Karlsen Jun 21 '18 at 10:45
  • This also has a flip-side, if you want to consider deleting a file separate from adding a separate file, there is no way for you to store anything in the repository that keeps this apart because to the git tool that looks at that commit later, it is 100% indistinguishable from the operation of moving and then optionally modifying the file. It *may* help to commit the delete separate from the create, this I don't know, but I *do* know that moves and copies are not tracked. – Lasse V. Karlsen Jun 21 '18 at 10:49
  • So the answer to your question "Is it possible to manually modify move/new/edit mappings on a git commit?" is unfortunately "no". – Lasse V. Karlsen Jun 21 '18 at 10:56
  • @LasseVågsætherKarlsen so the move information which git is able to display in the log command and which can be shown [here](https://stackoverflow.com/a/433156/1670956) is not stored on the commit but computed at display time? – Áxel Costas Pena Jun 21 '18 at 11:05
  • That is correct. A commit is a snapshot of the current state of the repository. Beyond a snapshot having a parent snapshot, it does not in any way store the difference between two snapshots. Note that the packing of snapshots into pack files can use diff and compression to store the snapshots more efficiently, but the *concept* of a commit is a snapshot, not a list of things that happened. – Lasse V. Karlsen Jun 21 '18 at 11:07
  • And that means that if you manage to configure your git repository or global configuration such that it correctly displays your repository history, for some definition of the word correct, then if I clone the repository, there is no guarantee my configuration will show me the same. – Lasse V. Karlsen Jun 21 '18 at 11:10
  • @LasseVågsætherKarlsen I know that internally it's a bunch of diffs; I was simply wondering whether the diffs could have metadata about file additions, moves, etc. So, when I rename an enormous file, Git records a deletion of all its lines under the old path, and an addition of all those lines under the new path? – Áxel Costas Pena Jun 21 '18 at 11:12
  • The diffs you're talking about are a storage optimization in the pack files, not diffs in the sense of "what did the programmer do". So no, the commits do not have metadata that you can edit. And no, it does not record a deletion of all the lines. If you change a big file and commit that, the immediate commit will be a full snapshot of the new file. When the packfile is generated, a diff is usually calculated that can be a lot smaller, but there is no way for you to say "I deleted these 5 lines"; that's calculated after the fact. – Lasse V. Karlsen Jun 21 '18 at 13:32

1 Answer

I'd like to close this as a duplicate of How does git handle moving files in the file system? but you've already referenced that in your question. I think from the comments you've gotten the answer, but let's put one in place formally:

  • Git stores snapshots. Deltas—diffs—do not enter the picture at the level at which Git actually works with the files. (They do occur "below" that level, inside pack-files, as Lasse Vågsæther Karlsen notes in a comment. It's worth mentioning that these deltas, which use a modification of xdelta, are not line-by-line; they're byte-range-by-byte-range. So these are not what Git shows you!)

  • Git does not store the programmer's intent. Git just stores a snapshot of each file; it must, at git diff or git show time, attempt to reconstruct "the logical intentions of the developer", as you put it.

  • Hence, as you concluded, "the move information which Git is able to display in the log command ... is not stored on the commit but computed at display time."
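
To make "computed at display time" concrete, here is a minimal Python sketch of how a viewer could guess renames from two snapshots alone. It is not Git's actual algorithm: the `classify_changes` function, the dictionary-based snapshots, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The only thing borrowed from Git is the idea of a similarity threshold (Git's rename detection defaults to 50%).

```python
from difflib import SequenceMatcher


def classify_changes(old, new, threshold=0.5):
    """Guess what happened between two snapshots (dicts of path -> content).

    Hypothetical helper: pairs each vanished path with the most similar
    new path and calls it a rename when the similarity reaches threshold.
    """
    changes = []
    # Paths present on both sides can only be "modified" or unchanged.
    for path in sorted(set(old) & set(new)):
        if old[path] != new[path]:
            changes.append(("modified", path))

    deleted = sorted(set(old) - set(new))
    unmatched_added = sorted(set(new) - set(old))
    for src in deleted:
        best, best_score = None, 0.0
        for dst in unmatched_added:
            score = SequenceMatcher(None, old[src], new[dst]).ratio()
            if score > best_score:
                best, best_score = dst, score
        if best is not None and best_score >= threshold:
            changes.append(("renamed", src, best))
            unmatched_added.remove(best)
        else:
            changes.append(("deleted", src))
    for dst in unmatched_added:
        changes.append(("added", dst))
    return changes


# Case 2 from the question: a/a.txt deleted, b/a.txt created with similar content.
before = {"a/a.txt": "alpha\nbeta\ngamma\ndelta\n"}
after = {"b/a.txt": "alpha\nbeta\ngamma\nepsilon\n"}

lenient = classify_changes(before, after)                # default threshold
strict = classify_changes(before, after, threshold=0.95)  # demand near-identity
print(lenient)
print(strict)
```

The same pair of snapshots is reported as a rename under the lenient threshold and as a delete plus an add under the strict one. Nothing in the snapshots themselves says which reading is "right", which is why two people with different viewer settings can see different histories.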

You should think of git diff (and hence git log -p) output as instructions to a computer, or maybe a human, about how to change the file on the left to make it match the file on the right. It doesn't matter how the change actually happened; Git just tries to come up with a minimal(ish) set of instructions to make it happen again, if you want it to happen again. This is true even if you diff the very first commit in the repository against the very last one: Git skips over all the intermediate commits, extracting the first and last snapshot, and computes a change-set that will take you from the first to the last.
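
The "first snapshot against last snapshot" point can be sketched in a few lines of Python. This uses `difflib.unified_diff` as a stand-in for Git's own diff machinery, and the file name `config.txt` and its contents are made up for the example.

```python
from difflib import unified_diff

# Three successive snapshots of one hypothetical file.
snapshots = [
    "timeout = 10\nretries = 3\n",
    "timeout = 20\nretries = 3\n",
    "timeout = 30\nretries = 5\n",
]


def change_set(left, right):
    """Instructions for turning `left` into `right`, as unified-diff lines."""
    return list(
        unified_diff(
            left.splitlines(keepends=True),
            right.splitlines(keepends=True),
            fromfile="a/config.txt",
            tofile="b/config.txt",
        )
    )


# Diff the first snapshot directly against the last one.
direct = change_set(snapshots[0], snapshots[-1])
print("".join(direct))
```

The intermediate value `timeout = 20` never appears in the output: the change-set only describes how to get from the left snapshot to the right one, not the route that was actually taken.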


As a final conclusion: to properly document the developer's intent and allow later software-based review of a file's edit history, changes that are significant and could be misinterpreted can be split across several commits, so that each copy, edit, move, or delete operation is explicit and cannot be hidden by two overlapping actions. It is ultimately up to the developer to organize the changes into well-documented, understandable, self-explanatory, high-quality commits.
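
A rough Python model of why splitting helps, using Case 1 from the question. The `exact_renames` helper and the file contents are hypothetical, and plain SHA-1 over the content stands in for Git's blob ids. The idea it illustrates is that a pure move leaves the content byte-for-byte identical, so a viewer can pair the old and new paths without any similarity guessing.

```python
import hashlib


def digest(content):
    """Content hash; a stand-in for a Git blob id (assumption)."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()


def exact_renames(old, new):
    """Pair vanished paths with new paths that have identical content."""
    deleted = {path: digest(old[path]) for path in old if path not in new}
    added = {digest(new[path]): path for path in new if path not in old}
    return {src: added[h] for src, h in deleted.items() if h in added}


production = "host = prod.example\nport = 443\n"
new_main = "host = dev.example\nport = 443\n"
moved_path = "configurations/production/configuration.txt"

start = {"main_configuration.txt": production}

# Everything in one commit: the old path still exists, so nothing "vanished".
combined = {"main_configuration.txt": new_main, moved_path: production}
one_commit = exact_renames(start, combined)

# Split in two: first a pure move, then the new file.
after_move = {moved_path: production}
after_create = {moved_path: production, "main_configuration.txt": new_main}
first_commit = exact_renames(start, after_move)
second_commit = exact_renames(after_move, after_create)

print(one_commit)     # no rename can be inferred
print(first_commit)   # the move is an exact match
print(second_commit)  # the new main file is just an addition
```

In the single-commit version the move is invisible, because main_configuration.txt exists on both sides and simply looks modified. In the split version the first step is an unambiguous move and the second is an unambiguous addition.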

torek
  • thank you. I edited your answer in order to add the practical case I finally used for achieving my goals, but the explanation is 100% perfect for the question posted. – Áxel Costas Pena Jun 21 '18 at 21:30
  • OK. (I'm going to edit for a few typos there and add a dividing horizontal line) – torek Jun 21 '18 at 21:34