
So I'm on my branch internal_env_board2 and I'm trying to switch to branch battery_board. However, when I attempt to do so I get an error:

$ git checkout battery_board
error: Your local changes to the following files would be overwritten by checkout:
        internal_env_board/internal_env_board2/internal_env_board_with_LCD_BOM.xlsx
Please commit your changes or stash them before you switch branches.
Aborting

I'm pretty annoyed because I know I haven't modified this .xlsx file. The only thing I did was open up the excel file, view its contents, and close it again. For some reason git thinks that there are changes in this file. I tried using git diff to find the differences between the working tree copy and the most recent committed copy and got the following:

$ git diff HEAD internal_env_board_with_LCD_BOM.xlsx
diff --git a/internal_env_board/internal_env_board2/internal_env_board_with_LCD_BOM.xlsx b/internal_env_board/internal_env_board2/internal_env_board_with_LCD_BOM.xlsx
index 0fb3369..10c945e 100644
Binary files a/internal_env_board/internal_env_board2/internal_env_board_with_LCD_BOM.xlsx and b/internal_env_board/internal_env_board2/internal_env_board_with_LCD_BOM.xlsx differ

I'm having a lot of trouble making sense of the output of this diff command. It just tells me that the files differ, but not how.

So I guess my two questions are:

1. Why does git think that I have modified a file just because I opened it?

2. How do I read the output of the git diff HEAD command?

phd
  • 1
    Whitespace in a binary **xlsx** file? – phd Dec 12 '18 at 01:46
  • 1
    @phd `.xlsx` files are nothing more than a zip file full of `XML` files, so yes, non-visible characters or bytes are a thing. –  Dec 12 '18 at 01:47
  • 1
    One cannot set up git to ignore whitespace changes inside a zip. – phd Dec 12 '18 at 01:48
  • 4
    1. Why do *you* think that Excel *didn't* modify the file just because you think you only opened it? :-) – paxdiablo Dec 12 '18 at 01:49
  • Extract the `.xlsx` to a directory and compare it to the remote/previous version using something like [Beyond Compare 4](https://www.scootersoftware.com/download.php). Git has never been known to lie about differences, unlike SVN. –  Dec 12 '18 at 01:49
  • And just opening an `.xlsx` file may have touched it and changed it in a way you cannot see with *normal* tools. That said, git never lies about changes. Just revert it and move on. –  Dec 12 '18 at 01:51
  • 4
    From my personal experience, Excel has a terrible habit of changing a file just by having it open. Annoying, but it's something you need to deal with. – Andrew Fan Dec 12 '18 at 01:51
  • 2
    @phd actually you can. It is not exactly trivial, but the `smudge/clean` feature makes it possible. I had to do this exact thing with a `COTS` product at work. It does the same thing: it stores its project files as a bunch of `XML` in a `.zip`. Check in the unzipped version and have it zipped on checkout. Hacky, but it works. –  Dec 12 '18 at 01:58
  • really you can't read the manual on how git works you want to be spoon fed exactly what to type? git is extensively documented on the internet. this spoon feed me entitlement attitude is why I quit answering questions on SO a long time ago. –  Dec 12 '18 at 16:03

3 Answers

5

UPDATE for follow-up question in comments


  1. Well, you're probably annoyed with the wrong tool. git suspects a file is modified when the file's stat info has changed, i.e. usually its size or "last modified" timestamp is different now than it was when you checked it out; it then confirms by comparing the file's content against what it has recorded. Why the file changed would be hard for us to say. I can say that some (if not all) versions of Excel are pretty aggressive about defining a "change". I've seen moving the active cell treated as a change, and if such a thing got saved then the file is different even if its useful content is not.

git uses the stat information as its first check for worktree changes because (1) it's reliable for well-behaved programs (since, in particular, the modified date should match if and only if the file hasn't been changed in any reasonable situation); and (2) examining the content of every file on every command would be horribly slow. (git does have some trickery that allows fast content comparison as long as both versions being compared are in the index or committed; but that's another story.) Even the content comparison is a byte-for-byte check of the file itself, not a cell-by-cell comparison of the spreadsheet data, since git doesn't know anything about Excel specifically; so if Excel in fact modified the file in any way, git would flag it as modified no matter what, by design.

  2. By default, git diff doesn't try to tell you much about the changes to binary files, because it doesn't assume it can do so in a way that will make sense. In the output you've shown, the first line confirms the filename being compared; the second line gives some housekeeping info about how git identifies the versions of the file (the abbreviated hashes of the old and new content, followed by the file mode); and the third line says that this file is binary and that therefore the default tool can't give you details, beyond saying that the two versions do, in fact, differ. (Which, going back to point (1) above, confirms that something, probably Excel, did write a change to the file, which is why git says it's modified.)
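
The two hex strings on the `index` line are abbreviated blob hashes: the committed content on the left, the working-tree content on the right. As a minimal sketch, run in a throwaway repository with a hypothetical file `book.bin` standing in for the spreadsheet, the same comparison can be reproduced by hand:

```shell
# Throwaway repository; book.bin is a hypothetical stand-in for the .xlsx file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

printf 'original bytes' > book.bin
git add book.bin
git commit -q -m "add book.bin"

# Simulate a program rewriting the file when it is opened.
printf 'rewritten bytes, slightly longer' > book.bin

committed=$(git rev-parse HEAD:book.bin)   # blob hash stored in the last commit
current=$(git hash-object book.bin)        # blob hash of the bytes on disk now

if [ "$committed" = "$current" ]; then
    echo "content identical"
else
    echo "content differs"
fi
```

If the two hashes match, the bytes are identical and git will not report the file as modified once it has re-checked it; if they differ, some program really did rewrite the file.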

If you can find a tool that more meaningfully compares Excel files, you could tell git to use it when comparing versions of this file (via gitattributes); I don't have a recommendation for such a tool, though.
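
As a sketch of the gitattributes route: since an `.xlsx` file is a zip archive of XML files, one stand-in for an Excel-aware tool is a `textconv` filter that dumps the archive's members as text. The driver name `xlsx` is an arbitrary label chosen here, `BOM.xlsx` is a hypothetical file name, and `unzip` is assumed to be installed. The resulting diff shows raw XML rather than cells, so treat it as a rough aid, not a real spreadsheet comparison.

```shell
set -e
cd "$(mktemp -d)"   # throwaway repository so nothing real is touched
git init -q

# Map every .xlsx file to a diff driver; the name "xlsx" is arbitrary.
echo '*.xlsx diff=xlsx' >> .gitattributes

# Tell git how that driver turns a file into text before diffing.
# unzip -c prints the archive members to stdout; -a converts text line endings.
git config diff.xlsx.textconv 'unzip -c -a'

# Confirm which diff driver git will use for a given path.
git check-attr diff -- BOM.xlsx
```

In a real repository, skip the first two setup commands and commit `.gitattributes`; `git diff HEAD -- path/to/the/file.xlsx` should then show differences in the XML inside the archive. The `git config` line is stored per clone, so each collaborator has to run it themselves.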


Is there a way to hold a gun to Git and say, "Hey I know there is a difference but discard it because I don't care and go to the new branch anyway"?

If the change has not been added (i.e. status shows it as an unstaged change) you can

git checkout -- path/to/the/file.xlsx

If the change has been added (i.e. status shows it as a change staged for commit), you can

git reset -- path/to/the/file.xlsx

and then it will be unstaged and can be handled as above.

For future reference, the output from status does tell you what commands to use to revert changes.
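
Putting those commands together, here is a minimal sketch of the whole sequence in a throwaway repository. The file name `BOM.bin` and the branch name `other` are hypothetical stand-ins for the spreadsheet and `battery_board`.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

printf 'version one' > BOM.bin
git add BOM.bin
git commit -q -m "add BOM.bin"
start=$(git rev-parse --abbrev-ref HEAD)   # name of the branch we start on

# A second branch whose copy of the file differs from ours.
git checkout -q -b other
printf 'version two, longer' > BOM.bin
git commit -q -am "change BOM.bin on other"
git checkout -q "$start"

# Simulate Excel rewriting the file, then the change being staged by accident.
printf 'rewritten by Excel on open' > BOM.bin
git add BOM.bin

git reset -q -- BOM.bin      # unstage (only needed if the change was added)
git checkout -- BOM.bin      # discard the working-tree change
git checkout -q other        # the switch now succeeds
git status --short           # prints nothing: the tree is clean
```

Both discard steps throw the local change away for good, so they only make sense when you are sure the difference is noise.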

Mark Adelsberger
  • So now that I understand that Excel probably _did_ make a change and that `git diff` isn't going to tell me a lot about what actually changed between the most recent committed version of the file and the working tree version of the file, what do I do? I don't want to add a new commit every time I want to `git checkout` a different branch after having opened a troublesome Excel file just to view it. Is there a way to hold a gun to Git and say, "Hey I know there is a difference but discard it because I don't care and go to the new branch anyway"? – Andrew Schroeder Dec 12 '18 at 02:29
2

You can't blame Git, if Excel (in some scenarios) modifies a file each time it opens it.

Also, you can't expect Git to report changes for such a file in a meaningful way: diff gives useful information for text files, but .xlsx files are binary.

See if the answers here and here and here help.

You might also try to confirm that your Excel is indeed modifying each .xlsx (timestamp and/or content) on mere open; I don't experience that. And, as a last resort, you could try to open the file in some XLSX viewer, or in LibreOffice Calc.

leonbloy
0

Working Solution

My colleagues and I had the same issue.

I found a solution that worked on my PC and on my peers' computers.

In your git folder, run one of the following commands:

$ git config core.autocrlf false
$ git config core.filemode false

(You may need to type both, depending on the project that you have)

Bonus: If this solution worked for you and you want to apply it to your entire PC (all git folders), just add the argument --global after git config. For example: git config --global core.autocrlf false
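
Before changing either setting, it may help to see what the repository currently uses. Here is a small sketch, run in a throwaway repository. For context, `core.autocrlf` controls line-ending conversion and `core.filemode` controls whether the executable bit is tracked, so whether they help with a spreadsheet depends on the project.

```shell
set -e
cd "$(mktemp -d)"
git init -q

# Show the effective value and the config file it comes from (prints nothing if unset).
git config --show-origin --get core.autocrlf || true
git config --show-origin --get core.filemode || true

# Apply the settings to this repository only.
git config core.autocrlf false
git config core.filemode false

# Read the values back to confirm.
git config --get core.autocrlf
git config --get core.filemode
```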

underflow