I am trying to download files from an online repository, mostly PDFs.

However, I only want to download the files of a specific commit. The total archive is over 1400 files, and the latest commit adds roughly 300 files to the total archive.

How do I clone only the newly uploaded 300 files from the repository?

Unlike other similar questions I have come across relating to downloading a single file, I would like to download the entire commit, which is over 300 files. For reference, the repo is here:

https://github.com/KingOfCramers/sidtoday

... and the commit of the new files that I would like to download (to my local computer) is here:

https://github.com/KingOfCramers/sidtoday/commit/07b7008f215ffe784068d9d2d14fb5d76875ca24

Harrison Cramer
  • Possible duplicate of [How to checkout only one file from git repository ('sparse checkout')?](https://stackoverflow.com/questions/2466735/how-to-checkout-only-one-file-from-git-repository-sparse-checkout) – Johnny Willer Aug 05 '18 at 16:23
  • Hello Johnny, I don't think that's what I'm looking for. I'm trying to copy many files from an entire commit, not just download one file. – Harrison Cramer Aug 05 '18 at 16:32
  • The git unit of commerce is the commit. If you want content to be downloadable as a unit, make a commit of that content. – jthill Aug 05 '18 at 17:25
  • I've forked this entire repository from another user, otherwise I wouldn't have this issue. I'm confused about what you mean: is there really no way to simply clone/download the files uploaded by an individual commit? I feel like it shouldn't be difficult to acquire the files displayed on the screen here: https://github.com/KingOfCramers/sidtoday/commit/07b7008f215ffe784068d9d2d14fb5d76875ca24 – Harrison Cramer Aug 05 '18 at 17:49
  • There's a difference between content aka snapshots aka commits and analysis/construction of differences between commits. A web browser's showing you which files are different between a commit and its parent, that's useful. – jthill Aug 05 '18 at 18:41
  • @HarryCramer you can pass a commit hash to download the entire commit... – Johnny Willer Aug 06 '18 at 16:48

1 Answer

> is there really no way to simply clone/download the files uploaded by an individual commit?

Yes, there really is no way to simply clone or download the files updated in an individual commit.

At least, there's no in-Git way. You can use Git as a tool to build whatever you like, if you control the server. If the server is GitHub, well, see the last paragraph.

The root of this problem is that a commit does not contain only changed files, nor does it contain only changes: each commit is a complete snapshot of all files. Hence, to find out what changed, you must start with two snapshots. Think of this as one of those Spot the Difference games: it does you no good to get one picture; you must get both.
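To make the "two snapshots" point concrete, here is a toy sketch in Python. It is not how Git is implemented: it simply models a snapshot as a dictionary from path to a content hash (the paths and hash strings are made up for illustration) and shows that "what changed" can only be computed from a pair of snapshots.

```python
# Toy model only: a "snapshot" is a dict mapping file path -> content hash.
# The paths and hashes used below are hypothetical, not taken from any real repo.

def compare_snapshots(parent, child):
    """Return the paths added, modified, and deleted going from parent to child."""
    added = sorted(p for p in child if p not in parent)
    deleted = sorted(p for p in parent if p not in child)
    modified = sorted(
        p for p in child if p in parent and child[p] != parent[p]
    )
    return {"added": added, "modified": modified, "deleted": deleted}


if __name__ == "__main__":
    parent = {"a.pdf": "h1", "b.pdf": "h2"}
    child = {"a.pdf": "h1", "b.pdf": "h2x", "c.pdf": "h3"}
    print(compare_snapshots(parent, child))
    # {'added': ['c.pdf'], 'modified': ['b.pdf'], 'deleted': []}
```

Given only `child`, there is no way to tell that `c.pdf` is new and `a.pdf` is not; that information exists only in the comparison.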

As a whole, Git is designed to deliver all the snapshots. Those are commits; commits are what is in a repository; so those are what you get. If you want a different result, note that the Git on the server has all the snapshots and can do the comparisons. You can use this to write your own software that does whatever you would like done, but you will need to control the software on the Git server. Fortunately, you can clone the entire repository onto your client, and then your client is a perfectly good server.
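Once you have a full clone, the comparison can be done locally. The following is a minimal sketch, assuming `git` is on your `PATH` and that `repo_dir` is an existing clone; the function names and the destination directory are hypothetical. It asks Git which paths were added between a commit and its first parent, then copies just those files out of the clone.

```python
import os
import subprocess
from pathlib import Path


def added_files_command(commit):
    """Build the git command that lists paths added by `commit`.

    Compares the commit with its first parent (`commit^`), so this does not
    work for a root commit, and for a merge it only considers the first parent.
    """
    return [
        "git", "diff", "--name-only", "-z", "--diff-filter=A",
        f"{commit}^", commit,
    ]


def parse_paths(raw):
    """Split NUL-separated output (from `-z`) into a list of path strings."""
    return [os.fsdecode(p) for p in raw.split(b"\0") if p]


def export_added_files(repo_dir, commit, dest_dir):
    """Copy every file added by `commit` from the clone into `dest_dir`."""
    listing = subprocess.run(
        added_files_command(commit),
        cwd=repo_dir, check=True, capture_output=True,
    ).stdout
    paths = parse_paths(listing)
    for path in paths:
        # `git cat-file blob <commit>:<path>` writes the raw file contents.
        blob = subprocess.run(
            ["git", "cat-file", "blob", f"{commit}:{path}"],
            cwd=repo_dir, check=True, capture_output=True,
        ).stdout
        target = Path(dest_dir) / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(blob)
    return paths
```

For the repository in the question, a call might look like `export_added_files("sidtoday", "07b7008f215ffe784068d9d2d14fb5d76875ca24", "new_files")`, run after cloning `https://github.com/KingOfCramers/sidtoday` into a directory named `sidtoday`.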

Note that once you do have a clone, `git fetch` into that clone uses a protocol that attempts to minimize network traffic, by having the two Gits compare notes. The server then prepares a so-called thin pack that contains deltas from objects that you already have, wherever that's feasible, so that you actually get just the incremental changes! But for this to work, you must have an existing clone.

Be aware, too, that if your server is specifically GitHub, GitHub offers a REST API (well, potentially multiple APIs: the current one is version 3), and you can use that API to compare commits and to download files. See in particular https://developer.github.com/v3/git/trees/ about obtaining trees (the snapshot within each commit is a tree). Note that there are length limits that, if exceeded, will force you to clone anyway.
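As a rough sketch of the GitHub route, the Python below uses the REST API's "get a commit" endpoint rather than the trees endpoint linked above, because its response already lists the files the commit touched, each with a `status` and a `raw_url`. The function names and the `User-Agent` string are hypothetical, and the sketch reads only one page of results: GitHub paginates the file list for large commits, so a commit with more than 300 files may need further requests or a clone.

```python
import json
import urllib.request


def commit_api_url(owner, repo, sha):
    """URL of the REST endpoint that describes a single commit."""
    return f"https://api.github.com/repos/{owner}/{repo}/commits/{sha}"


def fetch_commit(owner, repo, sha):
    """Fetch the commit description as a dict (first page of files only)."""
    request = urllib.request.Request(
        commit_api_url(owner, repo, sha),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "commit-file-lister",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def added_raw_urls(payload):
    """Return (filename, raw_url) pairs for files the commit added."""
    return [
        (entry["filename"], entry["raw_url"])
        for entry in payload.get("files", [])
        if entry.get("status") == "added"
    ]
```

Each `raw_url` can then be downloaded with `urllib.request.urlretrieve` or any HTTP client. Unauthenticated requests are rate-limited, which is another reason a clone is often simpler for hundreds of files.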

torek
  • While it is in fact impossible to download only the changed files, you can download some files from a given commit with `git archive`, as pointed out in my comment. It's possible to download the entire commit too. – Johnny Willer Aug 06 '18 at 16:55
  • @JohnnyWiller: Yes, depending on the server, you can often download an archive (from `git archive`) constructed from a specific commit if it has a branch or tag name, or even an arbitrary commit by hash ID. You can also do a `--single-branch` clone with `--depth 1` for the same case. Because `git archive` builds the archive on the fly, these can be a bit different. – torek Aug 06 '18 at 16:58