We have a large Git repository that I want to push to a self-hosted GitLab instance.
The problem is that the GitLab remote rejects the push:
git push --mirror https://mygitlab/xy/myrepo.git
This fails with the following error:
Enumerating objects: 1383567, done.
Counting objects: 100% (1383567/1383567), done.
Delta compression using up to 8 threads
Compressing objects: 100% (207614/207614), done.
remote: error: object c05ac7f76dcd3e8fb3b7faf7aab9b7a855647867: duplicateEntries: contains duplicate file entries
remote: fatal: fsck error in packed object
So I ran git fsck, which reports:
error in tree c05ac7f76dcd3e8fb3b7faf7aab9b7a855647867: duplicateEntries: contains duplicate file entries
error in tree 0d7286cedf43c65e1ce9f69b74baaf0ca2b73e2b: duplicateEntries: contains duplicate file entries
error in tree 7f14e6474400417d11dfd5eba89b8370c67aad3a: duplicateEntries: contains duplicate file entries
Next, I inspected the first tree with git ls-tree c05ac7f76dcd3e8fb3b7faf7aab9b7a855647867:
100644 blob c233c88b192acfc20548d9d9f0c81c48c6a05a66 fileA.cs
100644 blob 5d6096cb75d27780cdf6da8a3b4d357515f004e0 fileB.cs
100644 blob 5d6096cb75d27780cdf6da8a3b4d357515f004e0 fileB.cs
100644 blob d2a4248bcda39c0dc3827b495f7751b7cc06c816 fileC.xaml
Notice that fileB.cs is listed twice, with the same hash. I assume this is the problem, since there is no reason for the same file name and blob hash to appear twice in one tree.
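For larger trees, the duplicate names can be picked out mechanically rather than by eye. A minimal sketch, assuming the default git ls-tree output where a tab separates the entry name from the mode, type, and hash; duplicate_names is a helper name made up for this example:

```shell
# Hypothetical helper: print each entry name that occurs more than once.
# Reads `git ls-tree <tree>` output on stdin; the name is everything after the tab.
duplicate_names() {
    cut -f2- | sort | uniq -d
}

# Usage (inside the repository):
#   git ls-tree c05ac7f76dcd3e8fb3b7faf7aab9b7a855647867 | duplicate_names
```

For the tree above this should print fileB.cs only.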
I searched for the problem but could not find a way to fix it. One seemingly good resource I found was this: Tree contains duplicate file entries
However, it basically comes down to using git replace, which does not really fix the problem: git fsck still reports the error, and the remote still rejects the push.
Then there is this one, which seems to remove the file entirely, but I still need the file in the tree, just once instead of twice: https://stackoverflow.com/a/44672692/826244
Is there any other way to fix this? It really should be possible to repair the repository so that git fsck no longer reports any errors, right? I am aware that I will need to rewrite the entire history after the corrupted commits. I could not even find a way to identify the commits that point to these trees; otherwise I might be able to use rebase and patch the corrupted commit, or something similar. Any help would be greatly appreciated!
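One possible way to map a tree ID back to commits, sketched under assumptions: when the bad tree is the root tree of a commit, the %T placeholder of git log exposes it directly. For a tree nested in a subdirectory, git log --find-object should list the commits that add or remove it, since the git-log documentation says the option implies -t; this assumes a Git version new enough to have that option. commits_with_root_tree is a made-up helper name.

```shell
# Hypothetical helper: print the commits whose root tree is the given tree ID.
# Reads "<commit> <root-tree>" pairs on stdin.
commits_with_root_tree() {
    # Appending "" forces a string comparison, so an ID that happens to
    # look like a number is not compared numerically.
    awk -v tree="$1" '($2 "") == (tree "") { print $1 }'
}

# Usage (inside the repository):
#   git log --all --format='%H %T' | commits_with_root_tree c05ac7f76dcd3e8fb3b7faf7aab9b7a855647867
#
# For a tree nested in a subdirectory (assumption: this Git has --find-object):
#   git log --all --format='%H' --find-object=c05ac7f76dcd3e8fb3b7faf7aab9b7a855647867
```

Both variants walk the commit list once instead of listing every tree of every commit, which is likely where an hour-long search spends its time.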
UPDATE: Pretty sure I know what to do, but not yet how to do it:
- Create a new tree object from the old tree, but corrected, with git mktree <- done
- Create a new commit that is identical to the old one that references the bad tree, but pointing to the newly fixed tree <- difficult: I cannot easily find the commit that references the tree (my current approach runs for an hour or more), and once I have found it, I do not know how to create the modified commit
- Run git filter-branch -- --all <- should persist the commit replacements
Sadly, I cannot just use git replace --edit on the bad tree and then run git filter-branch -- --all, because filter-branch seems to only work on commits and ignores tree replacements...
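For reference, the three steps from the update might be sketched as follows. This is a rough outline, not a verified fix: dedupe_tree_entries and swap_commit_tree are made-up helper names, bad_tree and bad_commit are placeholders, and the commented pipeline assumes the bad tree is the root tree of the commit. If the tree is nested in a subdirectory, each parent tree up to the root would have to be rebuilt with git mktree in the same way.

```shell
# Hypothetical helper: drop exact duplicate lines (same mode, type, hash, and
# name) from `git ls-tree` output on stdin, keeping the first occurrence.
dedupe_tree_entries() {
    awk '!seen[$0]++'
}

# Hypothetical helper: rewrite the "tree" header (always line 1) of a raw
# commit object read from stdin so that it points to the tree ID in $1.
swap_commit_tree() {
    sed "1s/^tree .*/tree $1/"
}

# Usage (inside the repository; bad_tree and bad_commit are placeholders):
#   new_tree=$(git ls-tree "$bad_tree" | dedupe_tree_entries | git mktree)
#   new_commit=$(git cat-file commit "$bad_commit" \
#       | swap_commit_tree "$new_tree" \
#       | git hash-object -t commit -w --stdin)
#   git replace "$bad_commit" "$new_commit"
#   git filter-branch -- --all
```

The replacement here targets the commit rather than the tree, which matches the observation that filter-branch acts on commit replacements. If the original commit carries a GPG signature, that signature would no longer be valid for the rewritten commit.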