Objective
I have a remote GitHub repository that uses git-lfs to hold large binary files.
- I want others to be able to quickly download my code and data.
- If it makes the download faster, I don't need others to keep their copies of the repository under git version control.
- Preferably, I would also like to understand why a given approach is slow or fast.
Baseline approach (git lfs clone)
As a test of how others will download my repository, I ran the following command on a high-performance login node (with 72 Intel Xeon CPUs) on a Linux cluster, using a GPFS disk, with these versions of git and git-lfs:
- git version 2.10.2
- git-lfs/2.3.4 (GitHub; linux amd64; go 1.9.1; git d2f6752f)
$ time git lfs clone --progress git@github.com:PackardChan/chk2019-blocking-extreme.git
Cloning into 'chk2019-blocking-extreme'...
remote: Enumerating objects: 138, done.
remote: Counting objects: 100% (138/138), done.
remote: Compressing objects: 100% (114/114), done.
remote: Total 138 (delta 20), reused 138 (delta 20), pack-reused 0
Receiving objects: 100% (138/138), 148.16 MiB | 36.59 MiB/s, done.
Resolving deltas: 100% (20/20), done.
Git LFS: (64 of 64 files) 7.29 GB / 7.29 GB
real 4m51.156s
user 7m14.044s
sys 0m28.360s
This took nearly 5 minutes, even on a high-performance node. I noticed that the last line of output reached the full 7.29 GB after only 36 seconds. The rest of the time was spent running git update-index -q --refresh --stdin (according to top -c).
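As a rough check on where the time goes, the figures above can be broken down with a little shell arithmetic. This is only a sketch: it rounds the wall-clock time down to whole seconds and assumes 7.29 GB means 7290 MB; the variable names are mine, for illustration.

```shell
# Figures taken from the timing output above
total_s=$((4 * 60 + 51))   # real 4m51.156s, rounded down to whole seconds
download_s=36              # time until the LFS progress line reached 7.29 GB
size_mb=7290               # assumption: 7.29 GB treated as 7290 MB

rest_s=$((total_s - download_s))          # time spent after the download finished
share_pct=$((rest_s * 100 / total_s))     # integer percentage of the total run
rate_mb_s=$((size_mb / download_s))       # integer average download rate

echo "time after download: ${rest_s} s (${share_pct}% of the run)"
echo "average download rate: about ${rate_mb_s} MB/s"
```

So roughly 255 of the 291 seconds pass after the data has already arrived, which is why the download itself does not look like the bottleneck.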
I therefore believe performance could be substantially improved if update-index could be skipped. As mentioned under "Objective", if it makes the download faster, I don't mind giving up git version control.
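One way I could test this belief is to time the download and the working-tree checkout as separate steps. The following is a minimal sketch, not something I have benchmarked: time_lfs_phases is a hypothetical helper name, the target directory is a placeholder, and it assumes bash plus a git-lfs version that honours GIT_LFS_SKIP_SMUDGE and provides git lfs fetch and git lfs checkout.

```shell
# Hypothetical helper: time the LFS download and the checkout separately.
# The function body runs in a subshell, so the caller's directory is unchanged.
time_lfs_phases() (
    url="$1"   # repository URL
    dir="$2"   # target directory (placeholder name chosen by the caller)

    # Clone without letting the smudge filter download LFS content;
    # the working tree then contains only small pointer files.
    GIT_LFS_SKIP_SMUDGE=1 git clone "$url" "$dir" || exit 1
    cd "$dir" || exit 1

    # Phase 1: download the LFS objects into the local LFS storage
    time git lfs fetch

    # Phase 2: replace the pointer files in the working tree with real content
    time git lfs checkout
)

# Example call (not run here):
# time_lfs_phases git@github.com:PackardChan/chk2019-blocking-extreme.git z4phases
```

If the second phase dominates, that would support the idea that the time is spent updating the working tree and index rather than on the network transfer.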
Other unsuccessful attempts
- svn export
Inspired by this post, I tried:
time svn export https://github.com/PackardChan/chk2019-blocking-extreme/trunk z4svn
However, the LFS files were not downloaded correctly. This is also reported here.
- git archive
However, GitHub doesn't support git-archive.
- --depth=1
I tried this, but it did not perform better. This is understandable, as my repository has only one commit.
I am rather new to git. So, am I missing anything?