I struggled with some line-ending problems about 20 commits back and some weird things happened. Now git fsck shows:

Checking object directories 100% (256/256), done.
error in tree ee2060e71cb36d33be5ddc1fe9ca8d7dd0ab35cd: contains duplicate file entries
Checking objects: 100% (8633/8633), done.

and git show ee2060 shows:

File1.cs
File2.cs
File2.cs
File2.cs
File3.cs

This is preventing me from pushing to my remote. git push shows:

error: unpack failed: index-pack abnormal exit
To https://github.com/username/Project.git
 ! [remote rejected] master -> master (n/a (unpacker error))
error: failed to push some refs to 'https://github.com/username/Project.git'

I have tried repacking and garbage collecting. How can I resolve this problem?

Christopher Best

5 Answers

I used git-replace and git-mktree to fix this in the past. You essentially keep the broken tree object, but override all links and make them point to a new object.

  1. First, grab the bad tree: git ls-tree bad_tree_hash > tmpfile.txt. This writes the entries of your bad tree to a file. For example:

     040000·tree·3cdcc756ee0ed636c44828927126911d0ab28a18 →  xNotAlphabetic
     040000·tree·4ad0d8ef014b8cc09c95694399254eff43217bfb →  EXT
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
     040000·tree·fd0661d698ace91135a8473b26707892b7c89c32 →  ToolTester
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
    

NB: · and → stand for whitespace, a space and a tab respectively.

  2. Next, edit the text, removing the offending duplicate lines, and save it with Unix-style line endings (i.e. LF only, not CRLF). With this example, we get:

     040000·tree·4ad0d8ef014b8cc09c95694399254eff43217bfb →  EXT
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
     040000·tree·fd0661d698ace91135a8473b26707892b7c89c32 →  ToolTester
     040000·tree·3cdcc756ee0ed636c44828927126911d0ab28a18 →  xNotAlphabetic
    
  3. Run cat tmpfile.txt | git mktree, which creates and stores a new, fixed tree object and prints its hash: a55115e4a05ea9ac8b79e37b872024d64d4r2c2e, called new_tree_hash below for demo purposes.

  4. Finally, run git replace bad_tree_hash new_tree_hash. This creates a replacement reference, so that everything that previously pointed at the bad tree uses the new, fixed object instead.

This will solve your immediate problem. If you're interested, look at the overriding link in the .git/refs/replace folder.
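
The steps above can be sketched as a small script. This is only a rough sketch under assumptions: the function names dedupe_tree_listing and fix_duplicate_tree are made up for illustration, and the filter assumes the duplicated lines are exact copies, so keeping the first line per entry name is safe. Check the listing by hand before trusting it on a real repository.

```shell
# Keep only the first ls-tree line for each entry name (the field after the tab).
# Assumption: the duplicates are exact copies, so it does not matter which one is kept.
dedupe_tree_listing() {
    awk -F '\t' '!seen[$2]++'
}

# Hypothetical wrapper around the steps above; prints the hash of the new tree.
# Usage: fix_duplicate_tree <bad_tree_hash>
fix_duplicate_tree() {
    bad_tree=$1
    # ls-tree -> drop duplicate entries -> mktree writes the fixed tree object
    new_tree=$(git ls-tree "$bad_tree" | dedupe_tree_listing | git mktree) || return 1
    # Point everything that referenced the bad tree at the fixed one
    git replace "$bad_tree" "$new_tree" || return 1
    echo "$new_tree"
}
```

Piping through the filter avoids the manual editing step, and with it the risk of saving the file with CRLF line endings.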


The bad tree object will continue to generate warnings whenever you do a check on your repository with git fsck, but it can be ignored, and all your commits and other links will be consistent and working regardless.


8-year retrospective: There's probably a way to just delete the old, corrupt tree, since git replace should make it moot.

Further warning: This hack could also be rejected by a git hosting service, e.g. BitBucket or GitHub, since they could view the leftover bad tree as corruption.

TonyH
  • If I'm getting this duplicate error, but able to push and pull, can I ignore those messages? – arielma Mar 09 '20 at 12:46
  • I think so, since you'll never "lose" anything. It could make blame or other history things wonky in the future. Overall, the duplication only really warrants the warning. – TonyH Mar 09 '20 at 13:03
  • This answer had such promise, but even after the git replace, my new host (GitHub) refused to accept the pushed clone that I get from my old host (bitbucket) – Paul Bruneau Apr 21 '20 at 20:02
  • @Paul Any eventual success? – TonyH May 27 '20 at 14:36
  • @TonyH yes, I had to use https://rtyley.github.io/bfg-repo-cleaner/ as referenced elsewhere in this page. It was the only thing that would work. It was a little scary but it did work. – Paul Bruneau Sep 13 '21 at 15:49
  • @Paul Glad you got eventual success. My answer may leave too much corruption for a hosted git service, so I've noted that in my answer. – TonyH Sep 20 '21 at 16:06

I finally fixed the repo by doing the following:

  1. do a fresh clone from github, which only included commits before the problem occurred
  2. add my messed up repo from the filesystem as a remote on the new clone
  3. painstakingly check out commits from the bad repo into the working copy of the new clone

    # note the dot at the end: it checks the commit's files out into the working copy
    git checkout fe3254FIRSTCOMMITAFTERORIGIN/MASTER/HEAD .
    # without the dot, you move your HEAD to the commit instead, which
    # seems to bring the corrupt object into your good clone
    
  4. commit each in turn, manually copying the commit message from the other repo
  5. remove the corrupt repo from remotes
  6. garbage collect + prune

    git gc --aggressive --prune=now
    
  7. weep happily as git fsck shows no duplicate file entries
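
Steps 5 to 7 might look roughly like the sketch below. The function name prune_unreachable and the remote name badrepo are hypothetical. The git reflog expire call is my addition, not part of the steps above, on the assumption that reflog entries could otherwise keep the old objects alive. I left out --aggressive, since it is unclear from this thread whether it is needed.

```shell
# Hypothetical helper: drop reflog entries, prune unreachable objects, then verify.
# Before calling it, remove the corrupt repo from the remotes, for example:
#   git remote remove badrepo    # "badrepo" is a placeholder remote name
prune_unreachable() {
    # Assumption: nothing in the reflog is still needed
    git reflog expire --expire=now --all &&
    # Delete every object that is no longer reachable from any ref
    git gc --quiet --prune=now &&
    # Exits non-zero if fsck still reports errors
    git fsck
}
```

Run it only once every ref you care about points at the rebuilt history, because pruned objects cannot be recovered.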
Christopher Best
  • Why suggest --aggressive? The only thing it does is ignore all previously collected delta information. More info: http://metalinguist.wordpress.com/2007/12/06/the-woes-of-git-gc-aggressive-and-how-git-deltas-work/ – riezebosch Jan 06 '14 at 12:40
  • @riezebosch I don't remember why I included --aggressive or whether or not it was required to fix the problem. – Christopher Best Jan 08 '14 at 16:18
  • Okay, I can imagine it is needed to completely rebuild all deltas to get rid of dangling commits. – riezebosch Jan 09 '14 at 10:59

I had a problem of this ilk, and all the solutions here and in other SO threads failed to fix it for me. In the end I used BFG Repo-Cleaner to destroy all the commits which reference the bad folder name, which was probably overkill but successfully repaired the repo.

Henry Wilson

Check out a new branch at the commit just before the problematic one. Now check out the files from the problematic commit, then add and commit them using the same message (use the -C option). Repeat for the rest of the commits. After you're done, reset the other branch to point to this correct one. You can then push.
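
A rough sketch of that loop, assuming a linear history with no merges and that you are already on the new branch. The function name replay_commits and its two arguments are hypothetical. One limitation to be aware of: checking out paths from a commit does not delete files that the commit removed, so deletions would need handling by hand.

```shell
# Hypothetical helper: re-create each commit after <good_base> up to <bad_tip>
# on top of the current branch.
# Usage: replay_commits <good_base> <bad_tip>
replay_commits() {
    good_base=$1
    bad_tip=$2
    # Oldest first; commit hashes contain no whitespace, so word splitting is safe
    for commit in $(git rev-list --reverse "$good_base..$bad_tip"); do
        # Copy the commit's files into the index and working copy (note the dot)
        git checkout "$commit" -- . || return 1
        git add -A || return 1
        # -C reuses the message, author and author date of the original commit
        git commit -q -C "$commit" || return 1
    done
}
```

The loop stops at the first commit that fails, for example one that would produce no changes.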

Adam Dymitruk
  • Ok, I checked out before the problem and cherry-picked back the commits in order, skipping bad commits and I reset master to the new branch. However, git fsck still shows the duplicate entries, even after trying everything I could think of to force pruning, etc. As a sanity check, I checked out before the problem was introduced and removed all commits after that. The problem is still there. So I wrote a bash script to recursively search commit trees to find the bad tree, and it is not referenced by any of them. Is there a way to get git to prune unreferenced objects that I don't know about? – Christopher Best Jun 08 '12 at 14:45
  • As long as nothing references the bad commits any more, they will be removed. Just run `git gc --aggressive --prune=now` once your references are sorted out and the reflog is no longer holding on to them. – Adam Dymitruk Jun 12 '12 at 02:34

Rebasing your commits again might fix it. If that does not help, you can use git low-level commands (git-cat-file) to see which commits contain this weird tree object, and put a correct version of the tree, without the duplicates, in its place. However, I don't know of any automatic tools that might be able to fix this, and you'll probably have to change all the tree and commit objects that already link to the weird one.

By the way, git ls-tree ee2060 should show you more details about the data that are in the damaged tree, such as files that are referenced there.
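
As a rough sketch of how one might find the commits that contain the tree, the helper below walks every commit reachable from any ref and checks its root tree and all nested trees. The name commits_containing_tree is made up for illustration, it expects the full hash (for example from git rev-parse ee2060), and I am assuming git ls-tree can still read the damaged tree, as the output in the question suggests.

```shell
# Hypothetical helper: list commits whose root tree or any nested tree is <tree_hash>.
# Usage: commits_containing_tree <full tree hash>
commits_containing_tree() {
    target=$1
    for commit in $(git rev-list --all); do
        # Match either the commit's root tree or a subtree listed by ls-tree -r -t
        if [ "$(git rev-parse "$commit^{tree}")" = "$target" ] ||
           git ls-tree -r -t "$commit" |
               awk -v t="$target" '$2 == "tree" && $3 == t { found = 1 } END { exit !found }'
        then
            echo "$commit"
        fi
    done
}
```

If this prints nothing, the tree is not reachable from any ref and should only need pruning.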

che