
I am trying to understand how (and whether it is possible) to copy only specific files from a specific directory of a remote Git repository. I am not interested in continuing to work on those files or in getting their history.

For example, say the remote master branch holds (among many others) a directory named \src that contains .cpp and .h files. How would it be possible to get a copy of only the .h header files?

Of course, I am looking for an approach that does not rewrite the repo or cause any other undesired side effects. I simply want a local copy of some specified files from the remote.

I have considered git archive and sparse-checkout but could not understand if I can use them to achieve my goal.

Yannis
  • I read other answers where people clearly state that the sparse-checkout cannot be used for this (http://stackoverflow.com/a/14527198/3286832) – Yannis Mar 17 '17 at 11:58
  • Git does not store change-sets (that's Subversion), but commits (which represent the complete state of all directories and files at a point in time). – AnoE Mar 17 '17 at 12:15
  • @AnoE So, is it possible then to clone only specific files from remote in Git? – Yannis Mar 17 '17 at 12:18
  • @Yannis, no, that part of the comment was fine. ;) I have added an answer. – AnoE Mar 17 '17 at 12:20
  • Do you need to only grab a copy of the files or did you want to continue working on them and commit and push back into the repository? – Lasse V. Karlsen Mar 17 '17 at 12:23
  • @LasseV.Karlsen Just grab a copy (for example, of the header files). Not interested in the history, just the actual subset of files from a given directory. – Yannis Mar 17 '17 at 12:29
  • Updated my question and title to be more precise on my goal. – Yannis Mar 17 '17 at 12:58

1 Answer


There are two kinds of subsets you can specify during git clone:

  1. -b branchname --single-branch allows you to only get those parts of the history that directly lead up to branchname. It will skip everything not necessary to fully describe branchname.
  2. --depth n allows you to truncate history beyond a certain depth.
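
The two options above can be combined in one command. A minimal sketch, assuming a hypothetical helper named clone_subset whose four arguments (URL, branch, depth, destination) are placeholders and not values from the question:

```shell
# Hypothetical helper, not part of git itself.
# The ( ... ) body runs in a subshell, so the variables stay local to the call.
clone_subset() (
    url=$1      # remote repository URL (placeholder)
    branch=$2   # the only branch whose history is fetched
    depth=$3    # number of most recent commits to keep
    dest=$4     # local directory to clone into

    # --single-branch skips every other branch; --depth truncates the history.
    # (--depth already implies --single-branch; both are spelled out for clarity.)
    git clone --branch "$branch" --single-branch --depth "$depth" "$url" "$dest"
)
```

A call such as clone_subset https://example.com/project.git master 1 project (with a made-up URL) would still download the complete directory tree of the newest commit on master, just without older history or other branches.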

Aside from that, clone (as well as commit, push, pull, merge, and so on) always processes the whole directory tree starting at the root, by design. The commit object is the item these commands work with; they do not know about individual files (as opposed to CVS or Subversion, for example, which can and regularly do work on individual files).

There are some ways to work with specific files/directories, but those are rather low-level and probably not what you want. For example, you can use the git protocol to fetch individual git objects (commits, trees, blobs...) directly... but I feel that's not what you are asking for.

Update: if you just want to grab the files and ignore the history, you can use --depth 1 (to get the whole directory tree, but only for a single commit), grab your files, and be done with it. You would still need to download a lot more than you need, but at least it will be orders of magnitude less than the whole history (for something large like the Linux kernel).
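
As a rough sketch of that workflow, assuming a hypothetical helper named fetch_headers (the name, its arguments, and the choice to copy only files directly inside the given directory are all assumptions made for illustration):

```shell
# Hypothetical helper: shallow-clone into a throwaway directory, copy the
# header files out of one subdirectory, then delete the clone again.
# The ( ... ) body runs in a subshell, so the variables stay local to the call.
fetch_headers() (
    url=$1      # remote repository URL (placeholder)
    branch=$2   # branch to read from, e.g. master
    subdir=$3   # directory inside the repository, e.g. src
    dest=$4     # local directory that receives the copies

    workdir=$(mktemp -d) || exit 1

    # --depth 1 downloads the full tree of the newest commit only, no history.
    if ! git clone --quiet --depth 1 --branch "$branch" "$url" "$workdir/repo"; then
        rm -rf "$workdir"
        exit 1
    fi

    mkdir -p "$dest"
    # Copy only the .h files that sit directly inside the chosen directory.
    find "$workdir/repo/$subdir" -maxdepth 1 -type f -name '*.h' \
        -exec cp {} "$dest"/ \;

    # The temporary clone (including its .git directory) is no longer needed.
    rm -rf "$workdir"
)
```

The result in the destination directory is a set of plain files with no repository attached, so nothing on the remote or in any existing local repo is rewritten.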

Update 2: I am pretty sure that, short of writing your own implementation of the git networking protocol, there is no way to fetch single bits and pieces from a remote with the git commands. If you speak the protocol yourself, it is (relatively) easy, of course. http://git-scm.com/book/en/v2/Git-Internals-Transfer-Protocols describes it, and it does not seem outlandishly hard to use yourself.

AnoE
  • Wondering if your answer applies for cases where we only want to copy a file(s) from remote to your local with no history or tracking, do you know? –  Mar 18 '17 at 09:43