
I have a repo that is 1.4 GB, although the actual code is more like 250 MB. I'm trying to use BFG to get the size down.

When running git clone --mirror https://xxx@bitbucket.org/xxx/my-app.git

I'm getting:

16:13:59.706806 pkt-line.c:80           packet:     sideband< \2Compressing objects:  99% (2662/2679)\15
16:14:00.093716 pkt-line.c:80           packet:     sideband< \2Compressing objects: 100% (2679/2679)\15
16:14:00.094381 pkt-line.c:80           packet:     sideband< \2Compressing objects: 100% (2679/2679), done.
remote: Compressing objects: 100% (2679/2679), done.
16:14:00.126076 pkt-line.c:80           packet:     sideband< PACK ...
16:14:00.126414 run-command.c:664       trace: run_command: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 22026 on My-Mac-mini' --check-self-contained-and-connected --pack_header=2,43919
16:14:00.144702 exec-cmd.c:139          trace: resolved executable path from Darwin stack: /Applications/Xcode-13.0.0-Release.Candidate.app/Contents/Developer/usr/libexec/git-core/git
16:14:00.146271 exec-cmd.c:238          trace: resolved executable dir: /Applications/Xcode-13.0.0-Release.Candidate.app/Contents/Developer/usr/libexec/git-core
16:14:00.147252 git.c:444               trace: built-in: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 22026 on My-Mac-mini' --check-self-contained-and-connected --pack_header=2,43919
16:21:03.324329 http.c:756              == Info: Connection #0 to host bitbucket.org left intact
16:21:03.324455 pkt-line.c:80           packet:          git< 0000
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: index-pack failed

I have the following env vars in place:

export GIT_TRACE_PACKET=1                                                        
export GIT_TRACE=1
export GIT_CURL_VERBOSE=1

and the following .gitconfig:

[core]
        excludesfile = /Users/lewissmith/.gitignore_global
        compression = 0
        packedGitLimit = 512m
        packedGitWindowSize = 512m

[commit]
        template = /Users/xxx/.stCommitMsg

[pack]
        deltaCacheSize = 2047m
        packSizeLimit = 2047m
        windowMemory = 2047m

Most of the steps I've taken are based on advice from here: Github - unexpected disconnect while reading sideband packet

Is there anything I can do to get the mirror clone to work? Or anything I can do to further debug this issue?

Also, the error implies this is a network issue. Is that correct?

--

After great advice from @torek I ran

git fetch --deepen 1

but I get

13:54:22.746101 http.c:756              == Info: Connection #0 to host bitbucket.org left intact
13:54:22.746280 pkt-line.c:80           packet:          git< 0000
fetch-pack: unexpected disconnect while reading sideband packet
fatal: protocol error: bad pack header

Is this something I need to take up with Bitbucket?

lewis

1 Answer


I think you're running into the "server side times out while client side is processing the received data" bug-ette. To get around it—this method may not work, but is worth a try—you can start with:

git clone --mirror --depth 1 <url>

If that succeeds, enter the clone and run:

git fetch --unshallow

This may fail; if so, try git fetch --deepen 50 to just get 50 more commits at a time. Raise or lower this number depending on success vs failure. Eventually, a final git fetch --unshallow should finish successfully and leave you with a full (not shallow) clone, on which you can run The BFG.
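The deepen-in-steps idea above can be sketched as a small shell function. This is only a sketch under assumptions: the function name `deepen_until_complete`, the halving of the step on failure, and the round limit are all made up for illustration, and it assumes a Git new enough to support `git rev-parse --is-shallow-repository` (2.15 or later) and that the shallow state clears once the deepening reaches the start of history. It is meant to be run from inside the shallow clone.

```shell
# Sketch: keep deepening a shallow clone until it is no longer shallow.
# $1 = commits to add per fetch (default 50), $2 = maximum rounds (default 1000).
# Both parameters are hypothetical tuning knobs, not Git options.
deepen_until_complete() {
    step=${1:-50}
    max_rounds=${2:-1000}
    round=0
    # rev-parse prints "true" for as long as the repository is shallow.
    while [ "$(git rev-parse --is-shallow-repository)" = "true" ]; do
        round=$((round + 1))
        if [ "$round" -gt "$max_rounds" ]; then
            echo "still shallow after $max_rounds rounds; giving up" >&2
            return 1
        fi
        if git fetch --deepen "$step"; then
            echo "deepened by $step commits"
        elif [ "$step" -gt 1 ]; then
            # A failed fetch only discards the new pack, so retry with less.
            step=$((step / 2))
            echo "fetch failed; retrying with --deepen $step" >&2
        else
            echo "fetch failed even with --deepen 1" >&2
            return 1
        fi
    done
    echo "history is complete"
}
```

Called as `deepen_until_complete 50` inside the clone, it stops either when the repository is no longer shallow or when even a one-commit deepen fails; in the latter case the timeout is probably not something step size can work around.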

If you can't get around it this way, you'll need to wait for a Git version in which the bug is fixed. A short description of the bug follows.

What's going wrong

Cloning and fetching both consist of running git fetch. The git clone command has the fetch step built into it, so that you don't have to run it separately, but both do the same thing:

  • They make a connection to some other Git software and ask that Git software to connect, on its side—in this case, on Bitbucket—to some repository. This is "their Git": their software, reading from their repository.

  • They now have their Git spill out the names of branches and other names of interest. Each of these names provides one commit hash ID.

  • Their Git now waits for your Git to request one or more of each of these hash IDs, perhaps along with a request for some or all of the earlier commits that make up history. (Remember that any Git repository is a collection of commits, which are the history in the repository, plus these names that help you and Git find the commits.)

  • Once your Git software and their Git software have agreed as to which commits are to be sent, their Git packages up these commits and supporting objects. This is where you see things like:

    remote: Compressing objects: 100% (2679/2679), done.
    

    These are messages coming from their Git, reporting on the progress of them packaging up the commits and other objects (you can see them reported as "sideband" messages above). They then send your Git this "pack".

  • Your Git is now done talking with their Git, as they are not really going to send anything else. However, your Git software fails to close down the connection at this point. Your Git now begins analyzing the pack, checking that there are no errors in it, building an index, and so on. The time needed for this depends on the speed of your computer and the size of the pack.

  • While your Git is busy checking the validity of the pack file it received and creating an index for it, the other Git gets impatient with the lack of activity, and closes the connection.

  • Your Git takes the closed connection as an error, removes the pack and partial index file, and reports the fetch as having failed.

If the fetch is being run by git clone, the clone operation removes the entire repository. If the fetch is being run separately (as in git fetch --deepen or git fetch --unshallow), it only removes the "failed" pack and associated files. Of course, nothing has necessarily failed here; it's just that your Git saw their Git disconnect and thinks something went wrong.
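The early part of this exchange, where their Git lists names and hash IDs, can be observed on its own with `git ls-remote`, which transfers no objects. The wrapper below is a rough sketch: the function name `list_advertised_refs` and the output layout are made up for illustration, and the URL is whatever you would pass to git clone.

```shell
# Sketch: print each name the remote advertises with its commit hash ID.
# $1 = repository URL (the same one used for git clone).
list_advertised_refs() {
    url=$1
    tab=$(printf '\t')
    # git ls-remote prints one "<hash-id><TAB><ref-name>" line per name.
    git ls-remote "$url" | while IFS="$tab" read -r hash ref; do
        printf '%s -> %s\n' "$ref" "$hash"
    done
}
```

If this step succeeds quickly while the clone still fails, that is consistent with the failure happening later, during or after the pack transfer, rather than in the initial connection.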

torek
  • what a great answer, thank you! people like you make SO an awesome place to be. I updated my question, I couldn't get it to work with even `--deepen 1` – lewis Sep 27 '21 at 12:50
  • Yeah, in this case you probably need a faster client machine, or a fix to Git itself. You could try cloning with ssh instead of https, in case the timeouts there are different (longer, one might hope :-) ). – torek Sep 27 '21 at 20:40
  • in the end, cloning over ssh made the difference and I got it working. thanks again. – lewis Oct 12 '21 at 15:49
  • Incidentally, a recent commit to the development version of Git does the connection-closing earlier. This caused a few other minor problems to crop up, but they're being fixed, or are fixed, now, and the *next* release should have this fixed overall. – torek Oct 12 '21 at 23:35