
I'm trying to commit a file of 10-20 GB to a Git repository, but Git isn't allowing me to do so.

I'm getting the following error:

    fatal: confused by unstable object source data for d7cf3fdd8ba744a3ba85c70da114d5b4afbde390

Does Git allow such huge individual files to be committed, or is there a restriction on the maximum individual file size that can be committed to Git?

If there is a restriction, is there a way to overcome it?

user3160460
  • Out of curiosity what are you adding that is so massive? – AD7six Mar 11 '14 at 06:54
  • It's CSV data containing network routing information. – user3160460 Mar 11 '14 at 06:55
  • Can you motivate why you need the file to be versioned with git? Does it change often? Is the history important? I guess that you will probably not read too many of the diffs if a lot of those 10 GB changes? – evnu Mar 11 '14 at 07:08
  • As I mentioned, it contains network routing information, so it is expected to change often. History is also important (that's why I thought of Git as a version control system). – user3160460 Mar 11 '14 at 07:13
  • Is it changing while you are actually trying to commit it? – CB Bailey Mar 11 '14 at 07:17
  • No, I don't think it's changing. There are no modifications during the commit. – user3160460 Mar 11 '14 at 07:28

1 Answer


This message comes from this bit of git source code:

    git_SHA1_Final(parano_sha1, &c);
    if (hashcmp(sha1, parano_sha1) != 0)
            die("confused by unstable object source data for %s", sha1_to_hex(sha1));

What this means is that the contents of the file changed between the time git first looked at it (to determine the file's content-based SHA-1 object-name) and the time git was able to make a compressed "loose" object out of it.
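
One quick way to check whether the content really is stable is to hash the file twice in a row and compare the results (`routes.csv` below is just a stand-in for your actual file name):

    git hash-object routes.csv   # prints the SHA-1 git would use for this content
    git hash-object routes.csv   # run it again immediately
    # If the two hashes differ, something wrote to the file in between.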

That would happen if something is actively modifying the file while you're trying to add-and-commit it. Git needs a "stable snapshot" version (lock the file, or make a "safe" copy that won't change while git digests it, or some such).
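
A minimal sketch of that idea, assuming the data is produced somewhere outside the repository (all paths here are hypothetical): copy a point-in-time snapshot into the working tree and commit the copy, so nothing can touch it while git hashes and compresses it.

    cp /data/live/routes.csv /repo/routes.csv    # freeze a point-in-time copy
    cd /repo
    git add routes.csv
    git commit -m "routing data snapshot"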

That said, there are limits on the size of "reasonable" files in a git repository. See this answer by VonC (it has another link to a more detailed answer, also by VonC).

In the past, I worked with 2-4GB "files" within a repository, and they worked, but we were already abusing the idea of a "git repository" by then. These would also sometimes blow out the memory limits on very small servers: the problem is that the deltifier in the pack-file builder tries to mmap everything. On bigger machines, you can make bigger pack-files, and then the smaller machines just break.
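
If you do go down this road anyway, a few pack-related settings can help keep repacking within a machine's memory. This is only a sketch; the right values depend on your hardware:

    git config core.bigFileThreshold 512m   # store huge blobs without delta compression
    git config pack.windowMemory 256m       # cap delta window memory per thread
    git config pack.packSizeLimit 2g        # split packs instead of building one giant one
    git config pack.threads 1               # fewer threads, less total memory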

If you have enough RAM, it's possible. I would recommend against it though, at least until git has better large-file-handling algorithms.

torek
  • @user3160460, consider looking at [`git-annex`](https://git-annex.branchable.com/). – kostix Mar 11 '14 at 09:27
  • @kostix: git-annex seems to be a good solution to my problem; evaluating it now. :) Thanks for the suggestion. – user3160460 Mar 11 '14 at 18:18