125

Is git good with binary files?

If I have a lot of uncompressed files that are modified often, and many compressed files that are never (or almost never) modified, would git handle it well? For example, if I insert or remove data in the middle of a file and insert data near the end, will git notice that as it does with text?

If git isn't good with binary files, what tool might I consider?

TRiG
  • 1
    Very good with binary - I use it myself – tekknolagi Jan 15 '11 at 00:09
  • It's kinda true. You can put your /home under git revision and it should work pretty well. – Loïc Faure-Lacroix Jan 15 '11 at 00:12
  • 2
    This is not in the spirit of the question, which was clearly concerned with whether binary files have diffs done on them (probably for repository bloat and performance reasons). However, I didn't downvote it (and it appears whoever did has since removed the downvote). – coreyward Jan 15 '11 at 00:17
  • 1
    Note: you now have git-lfs, to store your binaries elsewhere: http://stackoverflow.com/a/29530784/6309 – VonC Apr 09 '15 at 05:57
  • 2
    Does it bloat the .git folder? – Porcupine May 08 '18 at 07:51

5 Answers

66

Out of the box, git can easily add binary files to its index, and it stores them efficiently unless you frequently update large incompressible files.
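To illustrate why the uncompressed/compressed distinction matters here: git stores objects zlib-compressed and, when packing, delta-compresses similar versions against each other. The sketch below is not git's delta algorithm. It is a rough Python illustration (the helper names `make_records`, `shared_fraction` and `demo` are made up for this example) that measures how many leading and trailing bytes two versions of a file still share after a small insertion in the middle, first as raw data and then after each version has been compressed with zlib.

```python
import random
import zlib


def make_records(count: int, seed: int = 0) -> bytes:
    """Build deterministic, text-like sample data (a stand-in for an uncompressed file)."""
    rng = random.Random(seed)
    lines = [f"record {i:06d} value={rng.randint(0, 9999):04d}\n" for i in range(count)]
    return "".join(lines).encode("ascii")


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Count the leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of `old` covered by the prefix and suffix it shares with `new`."""
    prefix = common_prefix_len(old, new)
    suffix = common_prefix_len(old[::-1], new[::-1])
    # Cap the sum so overlapping prefix/suffix matches are not counted twice.
    shared = min(prefix + suffix, len(old), len(new))
    return shared / len(old)


def demo() -> tuple[float, float]:
    original = make_records(5000)
    middle = len(original) // 2
    # Simulate a small edit: insert a few bytes in the middle of the file.
    edited = original[:middle] + b"INSERTED BLOCK\n" + original[middle:]
    raw = shared_fraction(original, edited)
    packed = shared_fraction(zlib.compress(original), zlib.compress(edited))
    return raw, packed


if __name__ == "__main__":
    raw, packed = demo()
    print(f"raw bytes shared:        {raw:.1%}")
    print(f"compressed bytes shared: {packed:.1%}")
```

For the raw data, everything before and after the insertion is still shared, so a delta only needs to carry the inserted bytes. For the compressed versions, the byte streams typically diverge from the edit onward, which leaves a delta encoder much less to reuse.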

The problems begin when git needs to generate diffs and merges: by default git cannot produce a human-readable diff of a binary file, and it cannot merge binary files in any way that makes sense. So any merge, rebase or cherry-pick in which both sides changed the same binary file will require you to resolve the conflict on that file manually.

You need to decide whether binary file changes are rare enough that you can live with the extra manual work they cause in a normal git workflow of merges, rebases and cherry-picks.

ndim
  • 33
    I'd have to point out that binary file changes aren't a problem; making changes in multiple places and then trying to merge them is. – Winston Ewert Jan 15 '11 at 00:24
  • 22
    git can generate meaningful diffs. A diff created with `git diff --binary` will be able to patch binary files. – CB Bailey Jan 15 '11 at 01:23
54

In addition to the other answers:

  • You can send a diff of a binary file using the so-called binary diff format. It is not human-readable, and it can only be applied if you have the exact preimage in your repository, i.e. without any fuzz.
    An example:

    diff --git a/gitweb/git-favicon.png b/gitweb/git-favicon.png
    index de637c0608090162a6ce6b51d5f9bfe512cf8bcf..aae35a70e70351fe6dcb3e905e2e388cf0cb0ac3 100
    GIT binary patch
    delta 85
    zcmZ3&SUf?+pEJNG#Pt9J149GD|NsBH{?u>)*{Yr{jv*Y^lOtGJcy4sCvGS>LGzvuT
    nGSco!%*slUXkjQ0+{(x>@rZKt$^5c~Kn)C@u6{1-oD!M<s|Fj6
    
    delta 135
    zcmXS3!Z<;to+rR3#Pt9J149GDe=s<ftM(tr<t*@sEM{Qf76xHPhFNnYfP!|OE{-7;
    zjI0MY3OYE5upapO?DR{I1pyyR7cx(jY7y^{FfMCvb5IaiQM`NJfeQjFwttKJyJNq@
    hveI=@x=fAo=hV3$-MIWu9%vGSr>mdKI;RB2CICA_GnfDX
    
  • You can use the textconv setting of a diff driver (assigned to files through gitattributes) to have git diff show a human-readable diff for binary files, or for parts of binary files. For example, for *.jpg files it can be the difference in EXIF information; for PDF files it can be the difference between their text representations (produced by pdftotext or something like that).
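As a rough sketch of the textconv idea: git runs the configured program with a single argument, the name of the file to convert, and diffs whatever text the program writes to stdout. The script below is a hypothetical converter (the file name `bininfo_textconv.py`, the driver name `bininfo` and the `*.bin` pattern are all assumptions for illustration) that prints the size, a SHA-256 digest and a short hex dump of the start of the file.

```python
#!/usr/bin/env python3
# Hypothetical textconv helper. Assumed wiring (names are illustrative):
#   .gitattributes:   *.bin diff=bininfo
#   git config diff.bininfo.textconv "python3 /path/to/bininfo_textconv.py"
import hashlib
import sys


def describe_bytes(data: bytes, preview_bytes: int = 64) -> str:
    """Return a small text summary of binary data that git can diff line by line."""
    lines = [
        f"size: {len(data)} bytes",
        f"sha256: {hashlib.sha256(data).hexdigest()}",
    ]
    # Hex dump of the first `preview_bytes` bytes, 16 bytes per line.
    head = data[:preview_bytes]
    for offset in range(0, len(head), 16):
        chunk = head[offset:offset + 16]
        lines.append(f"{offset:08x}  {chunk.hex(' ')}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # git passes the path of the file to convert as the only argument.
    with open(sys.argv[1], "rb") as handle:
        sys.stdout.write(describe_bytes(handle.read()))
```

A real setup would more likely call a format-aware tool (an EXIF reader for images, a PDF-to-text converter for PDFs). The output of a textconv program is meant for reading only, so a diff produced this way cannot be applied as a patch.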

HTH.

Jakub Narębski
17

If you've got really large binary files, you can use git-annex to store the data outside of the repository. Check out: http://git-annex.branchable.com/

John Gibb
  • 6
    Git-annex is quite wonderful, but probably better suited for files that *do not change all that often*, e.g. a collection of music files, pictures, PDFs,... – sr_ Feb 05 '13 at 10:10
  • @sr_ exactly, and the same goes for Git LFS. It seems there is no version control system suited to these types of use cases that is also distributed at its base (like Git). – Marc J. Schmidt Apr 04 '20 at 20:12
5

Well, git is good with binaries, but it won't handle them like text files. The trouble comes when you want to merge binary files: a diff of a JPEG, for instance, will never show you anything useful. Git works very well with text files, and is probably as bad as every other solution with binary files!

Loïc Faure-Lacroix
5

If you want a solution for versioning, you might want to consider git-lfs, which keeps a lightweight pointer to your file in the repository.

This means that when you clone your repo, it doesn't download all the versions, only the one that is checked out.
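To give an idea of what that lightweight pointer looks like: in place of the binary itself, the repository holds a small text file naming the spec version, a SHA-256 object id and the size in bytes. The function below is only an illustrative sketch that builds such pointer text in Python (`lfs_pointer` is a made-up helper, not part of git-lfs); in practice `git lfs` writes these files for you.

```python
import hashlib

# Version line used by Git LFS pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(content: bytes) -> str:
    """Build the text of an LFS-style pointer for the given file content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # The pointer stays a few lines long no matter how large the content is.
    print(lfs_pointer(b"\x00" * 10_000_000), end="")
```

Because only this pointer is versioned inside the repository, history stays small, and the large content is fetched separately for the revision you check out.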

Here's a nice tutorial on how to use it

danfromisrael