
I am trying to clone the tensorflow/models repo on a remote machine that I am connected to over SSH. I have tried many of the suggestions out there for fixing the issue, but none of them worked for me.

git clone --recursive https://github.com/tensorflow/models.git
Cloning into 'models'...
remote: Counting objects: 1670, done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 1670 (delta 10), reused 0 (delta 0), pack-reused 1642
Receiving objects: 100% (1670/1670), 49.23 MiB | 8.44 MiB/s, done.
Resolving deltas: 100% (670/670), done.
fatal: fsync error on '/home/OFFICE/utk/projects/syntaxnet/models/.git/objects/pack/tmp_pack_2w67RB': Input/output error
fatal: index-pack failed
– Utkrist Adhikari

3 Answers


The problem was that I was trying to clone onto an NFS file system. The solution is to clone the repo to a non-NFS location and then move the folder to the desired NFS location.

cd /tmp          # a non-NFS location
git clone blablabla.git
mv blablabla ~
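
Applied to the repository from the question, the same steps might look like this (a sketch; the NFS path is the one from the error message above):

cd /tmp          # non-NFS scratch space
git clone --recursive https://github.com/tensorflow/models.git
mv models /home/OFFICE/utk/projects/syntaxnet/   # move onto NFS afterwards
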
– Utkrist Adhikari

Short answer: use eatmydata (that's a program; install it with apt install eatmydata).

Long answer: Git calls the fsync() system call frequently to make sure the repository stays consistent. This matters especially when multiple people use the same repository concurrently, and it ensures the repository is left in a defined state if, for example, power is interrupted. After a pack file is written, it is forced to be synced (i.e. fully written to the actual disk, not just sitting in buffers) before the metadata is updated.
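
You can watch these fsync() calls happen yourself with strace (a diagnostic sketch; -f follows the child processes git spawns, such as index-pack):

strace -f -e trace=fsync,fdatasync git clone https://github.com/tensorflow/models.git 2>&1 | grep fsync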

Some filesystems, especially remote filesystems like NFS and sshfs, do not support fsync() properly, but git has no flag to disable these calls.
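
To confirm that the target directory really sits on NFS, you can check the filesystem type first (the path is the one from the error message in the question):

df -T /home/OFFICE/utk/projects/syntaxnet    # the Type column shows e.g. nfs or nfs4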

What can help, under Linux, is a wrapper called eatmydata. Any program run through the wrapper has its fsync() calls simulated without actually syncing. While this increases the risk of repository damage should a write not actually reach the disk, that is often acceptable when the process is supervised manually.

Just install eatmydata, then run:

eatmydata git clone --recursive https://github.com/tensorflow/models.git
sync
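
Under the hood, eatmydata preloads a small library that overrides fsync() and related calls, so an equivalent invocation (the library name and search path can vary by distribution, so treat this as a sketch) is:

LD_PRELOAD=libeatmydata.so git clone --recursive https://github.com/tensorflow/models.git
sync    # as above, flush everything once at the end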

If no apparent solution allows for cloning directly on the remote machine, try instead to:

  • clone the GitHub repo locally
  • make a bundle

     cd /path/to/my/repo
     git bundle create /tmp/myrepo.bundle --all
    
  • copy that one file (myrepo.bundle) to the remote machine over ssh (see the end-to-end sketch after this list)

  • clone it from the bundle on the remote machine:

    git clone myrepo.bundle myrepo
    
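Putting the steps together, a minimal end-to-end sketch (user@remote is a hypothetical host name; git bundle verify is an optional sanity check before cloning):

    # on the local machine
    cd /path/to/my/repo
    git bundle create /tmp/myrepo.bundle --all
    scp /tmp/myrepo.bundle user@remote:/tmp/

    # on the remote machine
    git bundle verify /tmp/myrepo.bundle    # confirms the bundle is self-contained
    git clone /tmp/myrepo.bundle myrepo
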
– VonC
  • I tried what you said but I am still getting this error: git clone myrepo.bundle models Cloning into 'models'... fatal: write error: Bad file descriptor error: index-pack died Checking connectivity... fatal: bad object 4bcbe8e140a6c83c151b93567f33e60e486922bd fatal: remote did not send all necessary objects – Utkrist Adhikari Dec 01 '16 at 10:38
  • @UtkristAdhikari that should only happen if your clone was shallow to begin with. On a full clone, the bundle (of a complete repo: http://stackoverflow.com/a/11795549/6309) should work. – VonC Dec 01 '16 at 21:55