
I'm trying to use Git and GitHub to sync a number of app configuration files. These are XML or plist files stored in a binary format. For example, a Keyboard Maestro .kmsync file.

I can open these files via a text editor to see an XML format.

But when I view these file diffs in a GitHub Pull Request, commit view, etc. I see a useless binary diff with no visible changes:

Showing with 0 additions and 0 deletions.
BIN +17 Bytes (100%)
Binary file not shown.

I can get a text-based diff to display locally in Git via a .gitattributes file. However, it appears that GitHub doesn't respect these settings:

GitHub doesn't use .gitattributes files for choosing which files to show in a diff, so it's not possible to get around this that way. [source]

I want to see the text-based changes and line diffs when I view these files on GitHub in my commits and Pull Requests.

For example, the GitHub PR here. Feel free to fork and experiment:
https://github.com/pkamb/so/pull/1

How can I convince the web view of a GitHub repo to use text-based diffing for certain "binary" files?


I cannot find an existing question for my specific ask (displaying a non-binary diff on GitHub).

The following questions cover this same behavior, but for local git (not GitHub).

My question is the opposite of this question, which seeks to display text files as binary files on GitHub:

pkamb
  • How about a link to the corresponding `.kmsync` file(s) in your repository and also a link to your `.gitattributes` file? That would make the problem easier to reproduce and less theoretical. I do not think it is a good idea to expect all people viewing your question (42 already at the time of writing this) to each create a sample repository, trying to replicate your situation. – kriegaex Oct 13 '20 at 12:54
  • @kriegaex good idea. I've linked to an active PR where this can be tested; feel free to fork. I have not yet pushed any `.gitattributes` changes to this repo. https://github.com/pkamb/so/pull/1 – pkamb Oct 13 '20 at 16:59
  • I just checked. The `.kmsync` file is a binary file! I thought it is an XML file accidentally treated as a binary. You said you can open that file in a text editor and see XML. That is impossible, unless your editor knows how to expand the binary format. How would GitHub know that? Please explain. I think you are asking a bit too much there. A `.gitattributes` file cannot make a binary file magically into a text file, only tell Git which files to treat as binary or text, in case it does not do the right thing automatically. This does not work locally in Git, so why would it work on GitHub? – kriegaex Oct 14 '20 at 03:44
  • @kriegaex details for the binary vs. XML format for this file type are described [here](https://forum.keyboardmaestro.com/t/plain-text-kmsync/1271/2). A text editor can [open these files fine](https://i.stack.imgur.com/V9733.png) as text/XML via the `Open With...` menu. I believe this is exactly what `.gitattributes` are for? Telling to diff as text vs. a binary. – pkamb Oct 14 '20 at 03:57
  • Then it is a special editor or an editor with a special plugin for that kind of file. Maybe under the hood it uses `plutil -convert` as mentioned in the thread you linked to. My editors here on Windows and my IDE definitely cannot open it. – kriegaex Oct 14 '20 at 04:26
  • What you should do instead is to save XML files in your repository and convert them to binary format during the build or deploy process, if you need those files, not the other way around. This is an SCM (source code management) best practice. Then you would not have any headaches concerning diffs anymore either. – kriegaex Oct 14 '20 at 04:30
  • @kriegaex local `git` can be set to diff binary files using an external tool, as is shown here: https://stackoverflow.com/a/15231630/1265393 . I could likely do the same via `plutil` for these files. But how to show it on GitHub... – pkamb Oct 14 '20 at 05:17
  • I know Git can be configured to locally diff binary files. But you cannot run your local tool on the GitHub server. Like I said, don't make a simple thing complicated. Commit XML files and generate the binaries as needed, not the other way around. Why would you customise all client workstations and even GitHub with a proprietary tool that might not even be available on all platforms just so as to be able to display binary files if there is a simple way to commit text files instead and generate the binaries from them? – kriegaex Oct 14 '20 at 05:36
  • I don't think this is possible on GitHub. There is special handling for some file types like pdf but I don't think you can configure this. – dan1st Oct 14 '20 at 05:56

1 Answer

There isn't a way to force GitHub to display these files as text, because they are not text. When GitHub renders files as part of an HTML page, they must be in some encoding, and the only reasonable choice for encodings these days is UTF-8. These files cannot be displayed as-is as UTF-8 because they contain byte sequences that are not valid in UTF-8, in addition to control characters, which generally cannot be rendered well in a web page.
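To make the encoding point concrete, here is a minimal sketch using Python's standard `plistlib`. It assumes the files in question are binary property lists, which the question suggests but which has not been verified for `.kmsync` specifically; the `sample` dictionary is a hypothetical stand-in for real file contents.

```python
import plistlib


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical stand-in for the contents of a configuration file.
sample = {"name": "Example Macro", "enabled": True}

# The same data serialized in both property list formats.
binary_form = plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)
xml_form = plistlib.dumps(sample, fmt=plistlib.FMT_XML)

print("binary header:", binary_form[:8])
print("binary is valid UTF-8:", is_valid_utf8(binary_form))
print("binary contains NUL bytes:", b"\x00" in binary_form)
print("XML is valid UTF-8:", is_valid_utf8(xml_form))
```

The binary form mixes structural marker bytes and NUL padding with the string data, which is why a page that must be served as UTF-8 has nothing sensible to show for it, while the XML form of the same data is ordinary text.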

It is possible to convert these files to text for local diffing by assigning them a diff driver with the diff attribute in a .gitattributes file and setting the matching diff.*.textconv option in your Git config. This works great on your machine, but it won't work on GitHub. First of all, GitHub doesn't have your tool for rendering files, and secondly, GitHub doesn't support external programs for rendering files in general, mostly for security reasons. Some common formats are supported, but this is not one of them.

Also note that the program to be used is stored in the Git configuration and not in the .gitattributes file; this is intentional, since shipping a list of programs to execute in the repository is a security problem. Therefore, GitHub can't possibly even know the program you'd be using here.
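As a rough sketch of what that local setup could look like, the script below is a hypothetical textconv helper (the file name `plist_textconv.py` and the driver name `plist` are made up for illustration). It assumes the files are binary property lists that Python's `plistlib` can read. Git runs a textconv program with the path of the file to convert as its argument and diffs whatever the program prints to standard output.

```python
#!/usr/bin/env python3
"""Hypothetical textconv helper: print a property list as XML text.

Assumed wiring (names are illustrative, not from the question):
  .gitattributes:   *.kmsync diff=plist
  local Git config: git config diff.plist.textconv "python3 /path/to/plist_textconv.py"
"""
import plistlib
import sys


def plist_to_text(path: str) -> str:
    """Load a binary or XML plist and return it as XML text."""
    with open(path, "rb") as handle:
        data = plistlib.load(handle)  # format is auto-detected
    # Sorted keys keep the output stable, so diffs only show real changes.
    xml_bytes = plistlib.dumps(data, fmt=plistlib.FMT_XML, sort_keys=True)
    return xml_bytes.decode("utf-8")


# Git appends the file path as the single command-line argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(plist_to_text(sys.argv[1]))
```

The `.gitattributes` line travels with the repository, but the `git config` line exists only on your machine, which is exactly why GitHub never learns which program to run.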

If your kmsync files have a plain text equivalent that you can compile into the binary format, then you can store that format in the repository and build it as part of a build step. That will be diffable and will still provide the binary formats that you can use for your project. This is no different than compiling code into binaries or plain text into PDFs.
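A build step along those lines might be sketched as follows. This assumes the target format is a binary property list and that the application accepts a regenerated file, neither of which has been confirmed for Keyboard Maestro; the function name and the paths in the usage line are hypothetical.

```python
import plistlib
from pathlib import Path


def build_binary(xml_source: Path, binary_target: Path) -> None:
    """Compile a committed XML plist into the binary form the app consumes."""
    with xml_source.open("rb") as src:
        data = plistlib.load(src)
    with binary_target.open("wb") as dst:
        plistlib.dump(data, dst, fmt=plistlib.FMT_BINARY)


# Example usage with hypothetical paths:
# build_binary(Path("config/macros.xml"), Path("build/macros.kmsync"))
```

With this arrangement only the XML file is committed, so GitHub diffs it like any other text file, and the binary is a generated artifact that can stay out of the repository.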

bk2204
  • GitHub [seems to](https://stackoverflow.com/a/24382933/1265393) use their `linguist` tool as a gitattributes alternative; is there any possibility of using that tool to diff binary data? – pkamb Oct 15 '20 at 16:51
  • Linguist is used for detecting the language of files. It's possible to configure it in `.gitattributes` in various ways, but none of those options can force a file to be rendered as plain text when it isn't text, for the reasons mentioned above. – bk2204 Oct 16 '20 at 01:48