I'm trying to build a continuous integration system. Each push to GitHub will trigger a build.
Each build needs to check out (or otherwise download) the repository at the commit it's processing. I'm looking for a way to do this that doesn't take minutes on large repositories, because the build itself only takes a few seconds…
Please note that I do not want to store data between builds, which rules out caching.
The solutions I've explored:
- `git clone` followed by a checkout of the commit: works, but takes minutes on large repositories.
- fetch a single commit: Git 2.5 supposedly introduced a way to fetch just one commit by SHA, but I cannot get it to work with GitHub; my guess is that they have not enabled it server-side. (Edit: confirmed, it doesn't work with GitHub.)
- use the GitHub API for Git data: I cannot figure out whether I can download all the files at a revision efficiently, i.e. without one HTTP request per file. (Edit: it seems GitHub lets you download the files as a "tree" (I'm not sure exactly what that means), but for large repositories the HTTP responses are truncated and the docs encourage you to simply use Git… back to square one.)
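For reference, here is a sketch of the single-commit fetch I attempted. It requires Git >= 2.5 on the client and `uploadpack.allowReachableSHA1InWant` enabled on the server, which GitHub does not seem to do; the repository URL and commit SHA are placeholders for whatever the CI system is building.

```shell
# Sketch: fetch exactly one commit into a fresh directory (no full clone).
# Requires the server to set uploadpack.allowReachableSHA1InWant (Git 2.5+);
# GitHub apparently does not, which is why this fails for me there.
fetch_single_commit() {
  repo_url=$1
  commit_sha=$2
  git init -q build
  cd build
  git remote add origin "$repo_url"
  # Ask for the one commit only, shallow (no history):
  git fetch -q --depth 1 origin "$commit_sha"
  # Detached checkout of the fetched commit:
  git checkout -q FETCH_HEAD
}
```

Against a server that allows it (e.g. a self-hosted repository with that config option set), this transfers a single shallow commit regardless of repository history size.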
Every other solution I've seen for GitHub assumes either a recent Git version on the server, or that it's fine to clone the repository once; in my case it's not, because I start from scratch on every build (that's a hard constraint).
So I'm asking about the specific case of GitHub: how can I download (by any means) the code at a specific commit, so that I can run continuous-integration tools on it?