74

This morning I made a shallow clone of the Linux sources

git clone --depth 1 https://github.com/torvalds/linux.git

which resulted in a linux folder of 851 MB.

Now I would like to pull the latest changes, but

git pull

starts a seemingly huge download. After 60 MB I'm at 3%, which extrapolates to 2 GB. However, the 5 commits since my clone change only a handful of lines.

Am I doing something wrong? What is the 2 GB that git is trying to download?

matec
  • Strange...I did the same commands and [verified the size](http://stackoverflow.com/a/8185326/1079354) of what I had downloaded to be about 135MB. I also did the same pull and nothing new was downloaded. Which version of Git are you using? I'm on 1.9.1 and am unable to reproduce your error. – Makoto Jun 02 '14 at 02:54
  • @Makoto My sentence about the downloaded size was misleading/wrong. I've edited it now. Probably the actual download in my case was 135 MB, too. I am also using git version 1.9.1. – matec Jun 02 '14 at 03:00
  • @Makoto If I clone the latest version, and then pull, I also just get `Already up-to-date`. So I suppose the problem only occurs if changes have been made to the remote repository since the clone. – matec Jun 02 '14 at 03:30

4 Answers

84

I think you can use --depth 1 in git pull too, so it gets just what's needed for the newest commit in the repository.

I'm not sure the default behaviour is to pull everything that's missing, because my git help pull lists a separate option for converting a shallow repository into a complete one:

git pull --unshallow

or

git fetch --unshallow

--unshallow
    Convert a shallow repository to a complete one, removing all the limitations imposed by shallow repositories.

I'm running git version 1.8.5.2 (Apple Git-48); this may be fairly new behaviour that differs a bit between versions.

Khaled Annajar
mgarciaisaia
  • I've tried `git pull --depth 1` now. (Before I checked `git status` and everything showed clean.) Then the pull tried to merge something but failed at some point. Now `git status` shows `Unmerged paths`. `git pull` and `git checkout` do not work any more. – matec Jun 02 '14 at 03:51
  • 4
    `git pull` / `git fetch` have `--unshallow` option – linquize Jul 03 '14 at 14:56
  • 3
    This always gets me. I do `git clone repo.git --depth=1`, then I forget about it, I run `git log` and I'm amazed that there are so few commits. Then finally I remember that I previously used the `depth` option. `git pull --unshallow` has saved my life every time. Thanks! – Șerban Ghiță Mar 06 '17 at 10:28
  • 1
    If I try `pull --depth 1` on a linux kernel repository I eventually get "fatal: refusing to merge unrelated histories" and unshallow takes a lot of time to be useful. I end up just deleting the whole thing and cloning it again since it's faster. – j riv Jan 06 '20 at 07:31
  • `--unshallow` does not seem to pull all branches; I had to delete the repo and clone again without the `--depth 1` option – angularrocks.com May 03 '22 at 07:23
7

Could any of the new commits be merge-commits pointing to commits not present in your tree? Perhaps --depth 1000 would work better and still be small enough.

Andreas Wederbrand
  • Most of them seem to be merge commits. I'll try if some depth helps. – matec Jun 02 '14 at 12:06
  • 3
    I do not fully understand how `--depth` works. Intuitively I would assume `--depth 1000` includes the last 1000 commits. But I tried `--depth 100` and end up with > 50000 commits reaching back to 2012. – matec Jun 02 '14 at 15:40
  • 1
    Finally a new commit (tack Linus!) so I could test it. It works: having cloned with `--depth 100`, `git pull` works fine (950 MB from `clone`, same after `pull`), while `--depth 1` leads to a significantly larger repository size after the `pull` – matec Jun 03 '14 at 03:47
  • That definitely explains why a linux kernel repo fails to pull with depth 1. Linus keeps merging from the trees of others. I don't fully get why more depth would help in that case but I guess it might. – j riv Jan 06 '20 at 07:43
2

Cloning or fetching with --depth records the cut-off commits in the .git/shallow file. When retrieving commit history, your request stops at the commits listed in that file, but if another branch has since been merged into the current branch, git follows the merge's other parent and fetches the whole history behind it.

From my blog post, Exploring Git Clone --depth:

Suppose you ran git clone --depth=1 on this branch structure when main was at c:

...  -  .  -  .  - [c] -  .  -  .  -  .  -  .  (main)
         \               /
           .  -  .  -  .  (xyz)

If you later fetched when main was at g, the merge at d would cause you to pull nearly the whole history (everything except b, the commit between a and c).

1000’s of commits  -  a  -  b  - [c] -  d  -  e  -  f  -  g  (main)
                       \               /
                         x  -  y  -  z  (xyz)
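
A minimal sketch of the walk described above, written in Python as a toy model and not git's actual negotiation code. The commit names follow the diagram; `root` is an assumed stand-in for the thousands of older commits, `b` is the commit between `a` and `c`, and the function names are hypothetical.

```python
# Toy model of a plain fetch into a shallow clone (not git's real algorithm).
# Each commit maps to its parents, following the diagram above.
PARENTS = {
    "root": [],          # assumed stand-in for the thousands of older commits
    "a": ["root"],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],
    "y": ["x"],
    "z": ["y"],
    "d": ["c", "z"],     # merge of xyz into main
    "e": ["d"],
    "f": ["e"],
    "g": ["f"],
}


def ancestors(tip, shallow=frozenset(), known=frozenset()):
    """Commits reachable from tip, skipping anything in `known`.

    Commits in `shallow` are treated as if they had no parents, which is
    the effect of listing them in .git/shallow.
    """
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen or commit in known:
            continue
        seen.add(commit)
        if commit not in shallow:
            stack.extend(PARENTS[commit])
    return seen


def commits_to_fetch(remote_tip, local_tip, shallow=frozenset()):
    """Commits missing locally that are reachable from the remote tip."""
    have = ancestors(local_tip, shallow=shallow)
    return ancestors(remote_tip, known=have)


shallow_fetch = commits_to_fetch("g", "c", shallow={"c"})
full_fetch = commits_to_fetch("g", "c")

print(sorted(shallow_fetch))   # includes a and root, but not b
print(len(shallow_fetch))      # 9
print(len(full_fetch))         # 7
```

In this model the shallow clone has to fetch more than a full clone would, because the walk through the merge reaches `a` and everything older, none of which the shallow clone ever had.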

My blog post gives 2 suggestions for this:

  1. Fetch with a specified depth and update your branch:
git fetch --depth 1 origin main
git reset --hard origin/main
  2. Clone with enough history that you probably won't get bypasses:
git clone --shallow-since=2022-06-01 repo
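
As a rough illustration of what a depth limit keeps, here is a small Python model. It assumes that depth is counted in generations from the tip along every parent, not as a total number of commits, which would also explain why a depth of 100 can yield far more than 100 commits in a merge-heavy history. The graph reuses the commit names from the diagram above; `root` and the function name are hypothetical, and this is a simplification, not git's implementation.

```python
from collections import deque

# Same commit graph as in the diagram; `root` is an assumed stand-in
# for the older history.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],
    "y": ["x"],
    "z": ["y"],
    "d": ["c", "z"],     # merge of xyz into main
    "e": ["d"],
    "f": ["e"],
    "g": ["f"],
}


def commits_within_depth(tip, depth):
    """Commits at most `depth` generations from tip (the tip is generation 1)."""
    generation = {tip: 1}
    queue = deque([tip])
    while queue:
        commit = queue.popleft()
        if generation[commit] == depth:
            continue  # this commit would become a shallow boundary
        for parent in PARENTS[commit]:
            if parent not in generation:
                generation[parent] = generation[commit] + 1
                queue.append(parent)
    return set(generation)


print(len(commits_within_depth("g", 1)))   # 1
print(len(commits_within_depth("g", 5)))   # 6
```

With a depth of 1 only the tip is kept, so the boundary moves forward with every fetch. With a depth of 5 the model keeps six commits, because both parents of the merge `d` fall within the limit.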
0

Just use

git pull --depth=50

or specify the depth you would like to download. That should be fine.

David Wong