I am introducing "clean architecture" into an existing codebase which involves moving many files to an "ApplicationDomain" project.

Git has correctly determined that the files are being renamed. However, for several of them it shows every line as changed, and I can't work out why.

Most of the files have not been modified: a hex dump and a hash comparison confirm the contents are exactly the same. Even the file permissions are the same.

This is frustrating because I don't want to lose the blame history for these files.

In a GitLab merge request, these files show their entire contents as changed, even though the contents are identical. For other similarly moved files, GitLab reports "File renamed with no changes". The difference is that the former files have a similarity index between 96% and 99%, whereas the latter files have 100% similarity.

For files that actually have changed as well as moved, Git correctly shows the rename and specific words/lines that have changed.

Any idea why Git is doing this for some files? How can I see what is causing the similarity score to be less than 100%?

Here is an example output of git diff -M master my-branch:

diff --git a/Project/Service/BlockingXML.xsd b/Project.ApplicationDomain/Service/BlockingXML.xsd
similarity index 96%
rename from Project/Service/BlockingXML.xsd
rename to Project.ApplicationDomain/Service/BlockingXML.xsd
index 92479d70..c3945c8f 100644
--- a/Project/Service/BlockingXML.xsd
+++ b/Project.ApplicationDomain/Service/BlockingXML.xsd
@@ -1,17 +1,17 @@
-<?xml version="1.0" encoding="utf-8"?>
-<xs:schema id="rows" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
-  <xs:element name="rows" msdata:IsDataSet="true" msdata:Locale="en-US">
-    <xs:complexType>
-      <xs:choice minOccurs="0" maxOccurs="unbounded">
-        <xs:element name="row">
-          <xs:complexType>
-            <xs:sequence>
-              <xs:element name="BlockRingOverhang" type="xs:string" minOccurs="0" />
-              <xs:element name="Qty" type="xs:string" minOccurs="0" />
-            </xs:sequence>
-          </xs:complexType>
-        </xs:element>
-      </xs:choice>
-    </xs:complexType>
-  </xs:element>
+<?xml version="1.0" encoding="utf-8"?>
+<xs:schema id="rows" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
+  <xs:element name="rows" msdata:IsDataSet="true" msdata:Locale="en-US">
+    <xs:complexType>
+      <xs:choice minOccurs="0" maxOccurs="unbounded">
+        <xs:element name="row">
+          <xs:complexType>
+            <xs:sequence>
+              <xs:element name="BlockRingOverhang" type="xs:string" minOccurs="0" />
+              <xs:element name="Qty" type="xs:string" minOccurs="0" />
+            </xs:sequence>
+          </xs:complexType>
+        </xs:element>
+      </xs:choice>
+    </xs:complexType>
+  </xs:element>
 </xs:schema>
\ No newline at end of file

I've compared a hex dump of several of the files and they are identical.

As an example:

           Path: C:\Temp\GitChangesProblem\BlockingXML-original.xsd

           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   EF BB BF 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E  <?xml version
00000010   3D 22 31 2E 30 22 20 65 6E 63 6F 64 69 6E 67 3D  ="1.0" encoding=
00000020   22 75 74 66 2D 38 22 3F 3E 0D 0A 3C 78 73 3A 73  "utf-8"?>..<xs:s
00000030   63 68 65 6D 61 20 69 64 3D 22 72 6F 77 73 22 20  chema id="rows"
00000040   78 6D 6C 6E 73 3D 22 22 20 78 6D 6C 6E 73 3A 78  xmlns="" xmlns:x
00000050   73 3D 22 68 74 74 70 3A 2F 2F 77 77 77 2E 77 33  s="http://www.w3
00000060   2E 6F 72 67 2F 32 30 30 31 2F 58 4D 4C 53 63 68  .org/2001/XMLSch
00000070   65 6D 61 22 20 78 6D 6C 6E 73 3A 6D 73 64 61 74  ema" xmlns:msdat
00000080   61 3D 22 75 72 6E 3A 73 63 68 65 6D 61 73 2D 6D  a="urn:schemas-m
00000090   69 63 72 6F 73 6F 66 74 2D 63 6F 6D 3A 78 6D 6C  icrosoft-com:xml
000000A0   2D 6D 73 64 61 74 61 22 3E 0D 0A 20 20 3C 78 73  -msdata">..  <xs
000000B0   3A 65 6C 65 6D 65 6E 74 20 6E 61 6D 65 3D 22 72  :element name="r
000000C0   6F 77 73 22 20 6D 73 64 61 74 61 3A 49 73 44 61  ows" msdata:IsDa
000000D0   74 61 53 65 74 3D 22 74 72 75 65 22 20 6D 73 64  taSet="true" msd
000000E0   61 74 61 3A 4C 6F 63 61 6C 65 3D 22 65 6E 2D 55  ata:Locale="en-U
000000F0   53 22 3E 0D 0A 20 20 20 20 3C 78 73 3A 63 6F 6D  S">..    <xs:com
00000100   70 6C 65 78 54 79 70 65 3E 0D 0A 20 20 20 20 20  plexType>..
00000110   20 3C 78 73 3A 63 68 6F 69 63 65 20 6D 69 6E 4F   <xs:choice minO
00000120   63 63 75 72 73 3D 22 30 22 20 6D 61 78 4F 63 63  ccurs="0" maxOcc
00000130   75 72 73 3D 22 75 6E 62 6F 75 6E 64 65 64 22 3E  urs="unbounded">
00000140   0D 0A 20 20 20 20 20 20 20 20 3C 78 73 3A 65 6C  ..        <xs:el
00000150   65 6D 65 6E 74 20 6E 61 6D 65 3D 22 72 6F 77 22  ement name="row"
00000160   3E 0D 0A 20 20 20 20 20 20 20 20 20 20 3C 78 73  >..          <xs
00000170   3A 63 6F 6D 70 6C 65 78 54 79 70 65 3E 0D 0A 20  :complexType>..
00000180   20 20 20 20 20 20 20 20 20 20 20 3C 78 73 3A 73             <xs:s
00000190   65 71 75 65 6E 63 65 3E 0D 0A 20 20 20 20 20 20  equence>..
000001A0   20 20 20 20 20 20 20 20 3C 78 73 3A 65 6C 65 6D          <xs:elem
000001B0   65 6E 74 20 6E 61 6D 65 3D 22 42 6C 6F 63 6B 52  ent name="BlockR
000001C0   69 6E 67 4F 76 65 72 68 61 6E 67 22 20 74 79 70  ingOverhang" typ
000001D0   65 3D 22 78 73 3A 73 74 72 69 6E 67 22 20 6D 69  e="xs:string" mi
000001E0   6E 4F 63 63 75 72 73 3D 22 30 22 20 2F 3E 0D 0A  nOccurs="0" />..
000001F0   20 20 20 20 20 20 20 20 20 20 20 20 20 20 3C 78                <x
00000200   73 3A 65 6C 65 6D 65 6E 74 20 6E 61 6D 65 3D 22  s:element name="
00000210   51 74 79 22 20 74 79 70 65 3D 22 78 73 3A 73 74  Qty" type="xs:st
00000220   72 69 6E 67 22 20 6D 69 6E 4F 63 63 75 72 73 3D  ring" minOccurs=
00000230   22 30 22 20 2F 3E 0D 0A 20 20 20 20 20 20 20 20  "0" />..
00000240   20 20 20 20 3C 2F 78 73 3A 73 65 71 75 65 6E 63      </xs:sequenc
00000250   65 3E 0D 0A 20 20 20 20 20 20 20 20 20 20 3C 2F  e>..          </
00000260   78 73 3A 63 6F 6D 70 6C 65 78 54 79 70 65 3E 0D  xs:complexType>.
00000270   0A 20 20 20 20 20 20 20 20 3C 2F 78 73 3A 65 6C  .        </xs:el
00000280   65 6D 65 6E 74 3E 0D 0A 20 20 20 20 20 20 3C 2F  ement>..      </
00000290   78 73 3A 63 68 6F 69 63 65 3E 0D 0A 20 20 20 20  xs:choice>..
000002A0   3C 2F 78 73 3A 63 6F 6D 70 6C 65 78 54 79 70 65  </xs:complexType
000002B0   3E 0D 0A 20 20 3C 2F 78 73 3A 65 6C 65 6D 65 6E  >..  </xs:elemen
000002C0   74 3E 0D 0A 3C 2F 78 73 3A 73 63 68 65 6D 61 3E  t>..</xs:schema>

and

           Path: C:\Temp\GitChangesProblem\BlockingXML-moved.xsd

           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   EF BB BF 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E  <?xml version
00000010   3D 22 31 2E 30 22 20 65 6E 63 6F 64 69 6E 67 3D  ="1.0" encoding=
00000020   22 75 74 66 2D 38 22 3F 3E 0D 0A 3C 78 73 3A 73  "utf-8"?>..<xs:s
00000030   63 68 65 6D 61 20 69 64 3D 22 72 6F 77 73 22 20  chema id="rows"
00000040   78 6D 6C 6E 73 3D 22 22 20 78 6D 6C 6E 73 3A 78  xmlns="" xmlns:x
00000050   73 3D 22 68 74 74 70 3A 2F 2F 77 77 77 2E 77 33  s="http://www.w3
00000060   2E 6F 72 67 2F 32 30 30 31 2F 58 4D 4C 53 63 68  .org/2001/XMLSch
00000070   65 6D 61 22 20 78 6D 6C 6E 73 3A 6D 73 64 61 74  ema" xmlns:msdat
00000080   61 3D 22 75 72 6E 3A 73 63 68 65 6D 61 73 2D 6D  a="urn:schemas-m
00000090   69 63 72 6F 73 6F 66 74 2D 63 6F 6D 3A 78 6D 6C  icrosoft-com:xml
000000A0   2D 6D 73 64 61 74 61 22 3E 0D 0A 20 20 3C 78 73  -msdata">..  <xs
000000B0   3A 65 6C 65 6D 65 6E 74 20 6E 61 6D 65 3D 22 72  :element name="r
000000C0   6F 77 73 22 20 6D 73 64 61 74 61 3A 49 73 44 61  ows" msdata:IsDa
000000D0   74 61 53 65 74 3D 22 74 72 75 65 22 20 6D 73 64  taSet="true" msd
000000E0   61 74 61 3A 4C 6F 63 61 6C 65 3D 22 65 6E 2D 55  ata:Locale="en-U
000000F0   53 22 3E 0D 0A 20 20 20 20 3C 78 73 3A 63 6F 6D  S">..    <xs:com
00000100   70 6C 65 78 54 79 70 65 3E 0D 0A 20 20 20 20 20  plexType>..
00000110   20 3C 78 73 3A 63 68 6F 69 63 65 20 6D 69 6E 4F   <xs:choice minO
00000120   63 63 75 72 73 3D 22 30 22 20 6D 61 78 4F 63 63  ccurs="0" maxOcc
00000130   75 72 73 3D 22 75 6E 62 6F 75 6E 64 65 64 22 3E  urs="unbounded">
00000140   0D 0A 20 20 20 20 20 20 20 20 3C 78 73 3A 65 6C  ..        <xs:el
00000150   65 6D 65 6E 74 20 6E 61 6D 65 3D 22 72 6F 77 22  ement name="row"
00000160   3E 0D 0A 20 20 20 20 20 20 20 20 20 20 3C 78 73  >..          <xs
00000170   3A 63 6F 6D 70 6C 65 78 54 79 70 65 3E 0D 0A 20  :complexType>..
00000180   20 20 20 20 20 20 20 20 20 20 20 3C 78 73 3A 73             <xs:s
00000190   65 71 75 65 6E 63 65 3E 0D 0A 20 20 20 20 20 20  equence>..
000001A0   20 20 20 20 20 20 20 20 3C 78 73 3A 65 6C 65 6D          <xs:elem
000001B0   65 6E 74 20 6E 61 6D 65 3D 22 42 6C 6F 63 6B 52  ent name="BlockR
000001C0   69 6E 67 4F 76 65 72 68 61 6E 67 22 20 74 79 70  ingOverhang" typ
000001D0   65 3D 22 78 73 3A 73 74 72 69 6E 67 22 20 6D 69  e="xs:string" mi
000001E0   6E 4F 63 63 75 72 73 3D 22 30 22 20 2F 3E 0D 0A  nOccurs="0" />..
000001F0   20 20 20 20 20 20 20 20 20 20 20 20 20 20 3C 78                <x
00000200   73 3A 65 6C 65 6D 65 6E 74 20 6E 61 6D 65 3D 22  s:element name="
00000210   51 74 79 22 20 74 79 70 65 3D 22 78 73 3A 73 74  Qty" type="xs:st
00000220   72 69 6E 67 22 20 6D 69 6E 4F 63 63 75 72 73 3D  ring" minOccurs=
00000230   22 30 22 20 2F 3E 0D 0A 20 20 20 20 20 20 20 20  "0" />..
00000240   20 20 20 20 3C 2F 78 73 3A 73 65 71 75 65 6E 63      </xs:sequenc
00000250   65 3E 0D 0A 20 20 20 20 20 20 20 20 20 20 3C 2F  e>..          </
00000260   78 73 3A 63 6F 6D 70 6C 65 78 54 79 70 65 3E 0D  xs:complexType>.
00000270   0A 20 20 20 20 20 20 20 20 3C 2F 78 73 3A 65 6C  .        </xs:el
00000280   65 6D 65 6E 74 3E 0D 0A 20 20 20 20 20 20 3C 2F  ement>..      </
00000290   78 73 3A 63 68 6F 69 63 65 3E 0D 0A 20 20 20 20  xs:choice>..
000002A0   3C 2F 78 73 3A 63 6F 6D 70 6C 65 78 54 79 70 65  </xs:complexType
000002B0   3E 0D 0A 20 20 3C 2F 78 73 3A 65 6C 65 6D 65 6E  >..  </xs:elemen
000002C0   74 3E 0D 0A 3C 2F 78 73 3A 73 63 68 65 6D 61 3E  t>..</xs:schema>

Update

I have run the commands for several of the problematic files as suggested by @LeGEC.

E.g: git show master:Project/BlockingXML.cs | Format-Hex and git show current:Project.ApplicationDomain/BlockingXML.cs | Format-Hex

Line endings are the same. Characters at the start of the file are the same. For some files, the contents are exactly the same. For others, the only difference is a change in the C# namespace. In theory, Git should be able to see this as a file move and display only the modified lines, as it has done with plenty of other files.

br3nt
  • 5
    This is definitely a CRLF vs LF-only line endings difference. The files in your *working trees* are both CRLF-line-ending format, but the ones in Git itself must be different: one has CRLF endings and one has LF-only endings. Since you compare them by extracting them to your working tree, which turns the LF-only one into CRLF, they compare the same. But Git compares the internal copies, not the externalized, CR-added copies. – torek Oct 13 '21 at 09:04
  • 3
    It seems likely that the files were originally committed with line-ending-whacking turned *off*, and with CRLF line endings, so that the original copies inside Git are in their original complete-with-CR format. Since then, someone adjusted some repository settings so that newly added files get stored as LF-only. Moving the file caused Git to re-copy to internal format, this time stripping the CRs. Those are now committed as CR-free copies. – torek Oct 13 '21 at 09:06
  • 1
    @br3nt : did you get those files on disk by running `git checkout {xxx}` ? if yes : some processing is applied on files on checkout -- for example : the line endings are fixed on checkout ... To compare what `git` has in its internal storage : try `git show master:that/file | hd`, and compare it with `git show my-branch:that/file | hd` – LeGEC Oct 13 '21 at 09:22
  • I had this problem related to end-of-line (CRLF vs LF) a while ago in some of my projects and was able to correct it going forward by adding a `.gitattributes` file. Take a look at https://stackoverflow.com/questions/7893599/how-to-turn-off-git-warnings-lf-will-be-replaced-by-crlf and https://git-scm.com/docs/gitattributes. – Jonathon S. Oct 13 '21 at 09:25
  • Thanks everyone for the comments. It’s entirely possible the settings were changed. It’s an old project and has also changed repos a couple years ago. I will run the cmds that LeGEC suggested and report back. I’m not at my PC atm so will be tomorrow. Thanks again. – br3nt Oct 13 '21 at 11:12
  • I ran the commands in PS: `git show master:Project/BlockingXML.xsd | Format-Hex` and `git show my-branch:Project.ApplicationDomain/BlockingXML.xsd | Format-Hex` and it confirms the files have the same content. – br3nt Oct 15 '21 at 00:58
  • I ran the `git show` command on several of the files showing as moved but with all lines modified. I can clearly see some of the files are exactly the same; for others, I can see minimal changes such as a change of the C# namespace. In theory Git should be showing these files as simply moved with no changes, or showing the specific lines that were modified. I just don't understand. I'll update the question with the details. – br3nt Oct 15 '21 at 01:08
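
The comparison suggested in the comments above can also be done without a shell pipeline. Some shells (PowerShell pipelines, for instance) may hand the next command decoded text lines rather than raw bytes, which can hide CR/LF differences. Below is a minimal Python sketch, assuming `git` is on the PATH and the script runs inside the repository. The helper names `read_blob` and `count_line_endings` are hypothetical; the paths are the ones from the diff in the question.

```python
import subprocess


def read_blob(rev_path: str) -> bytes:
    """Return the raw bytes Git stores for a '<rev>:<path>' spec.

    The output is captured as bytes, so no line-ending or encoding
    conversion is applied on the way.
    """
    result = subprocess.run(
        ["git", "cat-file", "blob", rev_path],
        check=True,
        capture_output=True,
    )
    return result.stdout


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR sequences, and note a UTF-8 BOM."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {
        "crlf": crlf,
        "lf": lf,
        "cr": cr,
        "bom": data.startswith(b"\xef\xbb\xbf"),
    }


if __name__ == "__main__":
    # Paths taken from the example diff in the question.
    old = read_blob("master:Project/Service/BlockingXML.xsd")
    new = read_blob("my-branch:Project.ApplicationDomain/Service/BlockingXML.xsd")
    print("old:", len(old), "bytes", count_line_endings(old))
    print("new:", len(new), "bytes", count_line_endings(new))
    print("byte-identical:", old == new)
```

If the two summaries differ only in the `crlf` and `lf` counts, that would point to the line-ending conversion described in the comments.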

1 Answer

-1

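A similarity score of less than 100% means that the blob object IDs of the file before and after the commit are different, so the stored contents are not byte-identical. Find the two object IDs (the "index" line of the diff shows them abbreviated, 92479d70..c3945c8f in the example above) and compare the output of "git cat-file -p" for each of them.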
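
To illustrate why any byte-level difference changes the object ID, here is a small sketch of how a blob ID is derived, assuming a repository that uses the default SHA-1 object format. The function name `git_blob_id` is hypothetical; the result should match what `git hash-object` prints for the same bytes.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 blob ID Git assigns to the given content.

    Git hashes the header "blob <size>\\0" followed by the raw bytes,
    so a single added or removed CR yields a different ID.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    crlf_version = b"<xs:schema>\r\n</xs:schema>"
    lf_version = b"<xs:schema>\n</xs:schema>"
    print(git_blob_id(crlf_version))
    print(git_blob_id(lf_version))
    print("same ID:", git_blob_id(crlf_version) == git_blob_id(lf_version))
```

Two files that look identical in an editor but differ only in line endings therefore get different IDs, which is enough for the similarity index to drop below 100%.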

Yoichi Nakayama