283

What is the most efficient mechanism (with respect to data transferred and disk space used) to get the contents of a single file from a remote git repository?

So far I've managed to come up with:

git clone --no-checkout --depth 1 git@github.com:foo/bar.git && cd bar && git show HEAD:path/to/file.txt

This still seems like overkill.

What about getting multiple files from the repo?

dreftymac
  • 3
    Aaw. I would love it if there was a built in way to do the equivalent of "cat-remote" and "tag-remote". – conny Jul 07 '10 at 11:27
  • 3
    I have this same problem: I want to have the same license file in 2 repos, edit the file in 1 repo, and have it auto-update the copy in the other repo. – GlassGhost Jan 26 '11 at 11:04
  • Possible duplicate of [How to checkout only one file from git repository?](http://stackoverflow.com/questions/2466735/how-to-checkout-only-one-file-from-git-repository) – Ciro Santilli OurBigBook.com Mar 30 '16 at 10:48

24 Answers

179

In git version 1.7.9.5, this seems to work for exporting a single file from a remote:

git archive --remote=ssh://host/pathto/repo.git HEAD README.md | tar xO

This will cat the contents of the file README.md.
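As a rough sketch, the one-liner above can be wrapped in a small shell function. The name `remote_cat` and its argument order are made up for illustration, and it only works against servers that allow `git archive --remote` (the comments below note that GitHub does not).

```shell
#!/bin/sh
# remote_cat REPO TREEISH PATH  (hypothetical helper, not part of git)
# Prints one file from a remote repository to stdout without cloning it.
remote_cat() {
    # git archive streams a tar archive of just the requested path;
    # tar -x -O writes the named member to stdout instead of to disk,
    # and -f - makes tar read the archive from the pipe explicitly.
    git archive --remote="$1" "$2" "$3" | tar -x -O -f - "$3"
}

# Example with the placeholder URL from the answer:
# remote_cat ssh://host/pathto/repo.git HEAD README.md
```

Passing the path to `tar` as well keeps the output limited to that one member, even if the archive ever contains more.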

x-yuri
Yisrael Dov
  • 38
    ... Except it doesn't work on GitHub. Dang. :( https://twitter.com/GitHubHelp/status/322818593748303873 – Rob Howard Sep 26 '13 at 14:20
  • 17
    This doesn't seem to yield the raw file but rather a tar file with just a single file. – Frerich Raabe Mar 10 '14 at 19:58
  • 22
    @FrerichRaabe just add ` | tar -x` to the command. `git archive --remote=ssh://host/pathto/repo.git HEAD README.md | tar -x` `cat README.md` – renier May 21 '15 at 18:46
  • 14
    You can use `tar -xO` to output to STDOUT for piping, e.g. `FILE=README.md && git archive --remote=ssh://host/pathto/repo.git HEAD "$FILE" | tar -xO "$FILE"` – paulcm Aug 25 '15 at 11:56
  • 4
    Exactly the answer I was looking for, but I get "fatal: Operation not supported by protocol." in response from Git. Argh. – mhvelplund Feb 18 '16 at 08:06
  • This apparently works, but I have an issue with it: the md5sum of what I get back after untarring does not match what I have in the git repo. This is what I get back: md5sum README.md f8359c59f8ebc61adfcef6c19aea0a97 README.md. This is what's in the git repo: md5sum README.md 2ff7094d2cb8d574b4ca04a95ceff534 README.md – louigi600 Aug 19 '16 at 07:23
  • Will export a header and a trailer with lots of nulls – fcm Oct 23 '17 at 20:45
  • 1
    This answer is a good start but the problem it doesn't make clear is that the concatenated contents of `README.md` will include a bunch of header junk, since its actually a tar file, not just the plain text. In the end I had to use `git archive --remote ${remoteGitRepo} HEAD ${sourceFilePath} | tar --extract --directory ${tempDir}` and then read the file at `${tempDir}/${sourceFilePath}` in order to get a clean version of the source file text, which is pretty much what the OP's original solution was doing. – Aaron Beall Oct 27 '17 at 20:12
  • 1
    I get the error: __fatal: Operation not supported by protocol.__ when I attempt to use it on a repo hosted on GitHub. If this method doesn't work on the protocol used by GitHub, then it can be safely regarded as "unreliable". – Sophia_ES Nov 19 '17 at 18:16
  • is it possible to specify a branch? – arod May 13 '18 at 20:28
  • @RobHoward For GitHub, see: https://stackoverflow.com/questions/9609835/git-export-from-github-remote-repository – Wolfgang Oct 22 '18 at 15:09
  • 1
    @arod Yes, just specify the branch name in place of `HEAD`. – Wolfgang Oct 22 '18 at 15:10
  • This does not work for a sha commit hash. From the [docs](https://git-scm.com/docs/git-upload-archive): "neither a relative commit like master nor a literal sha1 like abcd1234 is allowed" – tar Jun 09 '20 at 16:57
  • If you're on Windows and don't have `tar`, you can use `powershell -command "Expand-Archive -Force` – cowlinator Jul 09 '22 at 00:04
  • 2
    Just for the distracted people, that's the letter O and not the number zero: `-xO`. It makes the output be stdout instead of saving the file in the current directory. – txulu Sep 19 '22 at 08:37
84

Following on from Jakub's answer: git archive produces a tar or zip archive, so you need to pipe the output through tar to get the file content:

git archive --remote=git://git.foo.com/project.git HEAD:path/to/directory filename | tar -x

This will save a copy of 'filename' from the HEAD of the remote repository in the current directory.

The :path/to/directory part is optional. If you omit it and pass path/to/directory/filename as the path instead, the fetched file will be saved to <current working dir>/path/to/directory/filename.

In addition, if you want to enable use of git archive --remote on Git repositories hosted by git-daemon, you need to enable the daemon.uploadarch config option. See https://kernel.org/pub/software/scm/git/docs/git-daemon.html

DavidRR
Robert Knight
  • 4
    If it is a text file and we want to save it to another path, it is good to use: | tar -xO > ~/destfile.ext – yucer Jul 15 '15 at 10:52
  • does it work with specific commit? (i.e. one specifies both specific file and commit) – Alleo Nov 11 '21 at 23:26
  • 1
    Yes. Replace `HEAD` with the commit ID that you want to use. `HEAD` is an alias that refers to either the currently checked out commit (if applicable) or the tip of the default branch. I wrote the above answer years ago and learned this morning that GitHub doesn't support `git archive`, so that makes it a lot less useful. – Robert Knight Nov 13 '21 at 06:50
  • 1
    Looks like the best answer to me. Adding a `v` as another option to `tar -x` doesn't hurt. Also, it may be good to note that it works **also for a specific folder**, not only a single file: `git archive --remote=git://git.foo.com/project.git HEAD path/to/folder/ | tar -xv` – M-Jack May 11 '22 at 14:15
  • **fatal: operation not supported by protocol** – gerrit Jul 06 '23 at 09:49
47

If a web interface is deployed (like gitweb, cgit, Gitorious, ginatra), you can use it to download a single file ('raw' or 'plain' view).

If the other side has enabled it, you can use git archive's '--remote=<URL>' option (and possibly limit it to the directory the given file resides in), for example:

$ git archive --remote=git@github.com:foo/bar.git --prefix=path/to/ HEAD:path/to/ |  tar xvf -
Jakub Narębski
  • Note: the example was not tested! – Jakub Narębski Jul 14 '09 at 15:51
  • 7
    For your own repositories you need to specifically enable upload-archive if using git-daemon (git:// style urls) with `git config daemon.uploadarch true` on the remote repository. By default git daemon disables remote archive with "fatal: remote error: access denied or repository not exported: ..." – patthoyts Feb 07 '13 at 09:34
  • +1 The `git archive` approach was my first try - but then I noticed that requiring `tar` on the client machine wasn't exactly convenient for Windows users. We ended up fetching from our local `cgit` server. It works, but it's not as fast as I'd like it to be (and it still requires running `unix2dos` or similar on Windows machines since we store files with Unix line endings in the Git repository). – Frerich Raabe Mar 10 '14 at 19:47
  • Is there a GUI that can browse the remote git and where you can set off this `git archive...` command automatically in the background to see single files directly inside the gui? – rubo77 Mar 20 '14 at 23:27
  • 1
    @FrerichRaabe Use -o fetched.zip. Also see --format= option. – akhan May 06 '14 at 19:47
  • 5
    For what it's worth, it doesn't look like this works on GitHub hosted repositories. See https://help.github.com/articles/can-i-archive-a-repository and https://groups.google.com/forum/#!topic/github/z8vLHcX0HxY – vmrob Aug 20 '14 at 22:19
  • Append `|tar -xO` to write the file to stdout. – neuhaus Jan 19 '18 at 10:39
44

Not in general, but if you are using GitHub:

For me, wget on the raw URL turned out to be the best and easiest way to download one particular file.

Open the file in the browser and click on the "Raw" button. Now refresh your browser, copy the URL, and run wget or curl on it.

wget example:

wget 'https://github.abc.abc.com/raw/abc/folder1/master/folder2/myfile.py?token=DDDDnkl92Kw8829jhXXoxBaVJIYW-h7zks5Vy9I-wA%3D%3D' -O myfile.py

Curl example:

curl 'https://example.com/raw.txt' > savedFile.txt
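
For public GitHub.com repositories, the raw URL can also be assembled by hand rather than copied from the browser. The sketch below assumes the `raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>` layout that appears in the comments on this page; the function name `github_raw_url` is made up for illustration, and private repositories or GitHub Enterprise hosts will need a different URL and authentication.

```shell
#!/bin/sh
# github_raw_url OWNER REPO REF PATH  (hypothetical helper)
# Prints the raw-content URL for a file in a public GitHub.com repository.
github_raw_url() {
    # $3 is a branch name, tag, or commit; $4 is the path inside the repository
    printf 'https://raw.githubusercontent.com/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example (placeholder names, needs network access):
# curl -fsSL "$(github_raw_url foo bar master path/to/file.txt)" -o file.txt
```

Quoting the command substitution keeps the URL intact even if the path contains characters the shell would otherwise interpret.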
Kermit
Ankur Agarwal
  • 5
    This is the easiest solution, and works for any raw txt one could find. `curl https://example.com/raw.txt > savedFile.txt` – JacobPariseau Jan 05 '17 at 23:06
  • wget example doesn't work, curl example does though. – Kyle Baker Feb 09 '17 at 21:56
  • Works just fine for me. Did you put your url in quotes on the commandline ? – Ankur Agarwal Feb 10 '17 at 21:24
  • this does not preserve git history – crypdick Jun 30 '18 at 02:34
  • 2
    The question asks about Git; this answer assumes GitHub is Git, and it is nowhere related to Git. It's based on additional APIs offered by a prominent Git solution provider! – Ravinder Payal Mar 22 '21 at 18:24
  • If u need a raw file content from github, just run curl with this url e.g.: `curl 'https://raw.githubusercontent.com/SoliDry/api-generator/master/tests/functional/oas/openapi.yaml' > openapi_test.yaml` It is easily accessible from UI btn `raw` btw. – Arthur Kushman May 14 '21 at 16:08
17

To export a single file from a remote:

git archive --remote=ssh://host/pathto/repo.git HEAD README.md | tar -x

This will download the file README.md to your current directory.

If you want the contents of the file exported to STDOUT:

git archive --remote=ssh://host/pathto/repo.git HEAD README.md | tar -xO

You can provide multiple paths at the end of the command.
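
A minimal sketch of the multi-path case, assuming a server that permits `git archive --remote`. The function name `fetch_files` and its argument order are invented here; it extracts the requested paths into a destination directory instead of the current one.

```shell
#!/bin/sh
# fetch_files REPO TREEISH DEST PATH...  (hypothetical helper)
# Extracts one or more paths from a remote repository into DEST.
fetch_files() {
    ff_repo=$1
    ff_treeish=$2
    ff_dest=$3
    shift 3
    # Every remaining argument is passed to git archive as a path;
    # tar -C unpacks the resulting archive under the destination directory.
    mkdir -p "$ff_dest" &&
        git archive --remote="$ff_repo" "$ff_treeish" "$@" | tar -x -f - -C "$ff_dest"
}

# Example with the placeholder URL from the answer:
# fetch_files ssh://host/pathto/repo.git HEAD ./out README.md docs/INSTALL.md
```

Directory structure is preserved, so `docs/INSTALL.md` ends up at `./out/docs/INSTALL.md`.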

Kousha
8

I solved it this way:

git archive --remote=ssh://git@gitlab.com/user/mi-repo.git BranchName path-to-file/file_name | tar -xO path-to-file/file_name > /path-to-save-the-file/file_name

Note that the path inside the repository is relative to the repository root, so it has no leading slash. If you want, you can replace "BranchName" with "HEAD".

matiasmasca
8

If no other answer worked (e.g. because of restrictive GitLab access), you can do a "selective checkout" by:

  1. git clone --no-checkout --depth=1 --no-tags URL
  2. git restore --staged DIR-OR-FILE
  3. git checkout DIR-OR-FILE

Although this solution is 100% git compliant and can also check out a directory, it is not as disk- or network-efficient as doing a wget/curl on a single file.
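
The three steps above can be sketched as one function. The name `selective_checkout` and its arguments are assumptions for illustration, and `git restore` needs Git 2.23 or newer.

```shell
#!/bin/sh
# selective_checkout URL PATH DEST  (hypothetical helper)
# Shallow-clones URL into DEST without checking anything out, then
# materialises only PATH (a file or a directory) in the working tree.
selective_checkout() {
    sc_url=$1
    sc_path=$2
    sc_dest=$3
    # Step 1: fetch the latest commit only, with no tags and no working tree
    git clone --quiet --no-checkout --depth=1 --no-tags "$sc_url" "$sc_dest" &&
        # Step 2: put just the wanted path into the index
        git -C "$sc_dest" restore --staged -- "$sc_path" &&
        # Step 3: write that path from the index into the working tree
        git -C "$sc_dest" checkout -- "$sc_path"
}

# Example (placeholder URL):
# selective_checkout git@github.com:foo/bar.git path/to/file.txt ./bar
```

The clone still downloads every blob reachable from the latest commit, which is why the answer calls it less efficient than a plain HTTP download.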

ATorras
8

It looks like a solution to me: http://gitready.com/intermediate/2009/02/27/get-a-file-from-a-specific-revision.html

git show HEAD~4:index.html > local_file

where 4 means four revisions before HEAD and ~ is a tilde, as mentioned in the comment.

Evan Carroll
Mars Robertson
  • Make sure to notice that is it NOT the 'minus sign' '-' between 'HEAD' and '4', but the 'tilde' '~'. Apparently I haven't read the git docs well enough, or my glasses need updating ;-) – Dennis Jul 04 '12 at 16:04
  • 23
    This doesn't seem to get the file from a remote repository though, like the OP needs. – Mike Weller Jan 18 '13 at 13:11
  • Or: ```git show HEAD:./my_other_file > local_file``` if the file isn't in your root dir:) – kenorb Jun 21 '13 at 11:52
  • Or: ```git show refs/remotes/my_remote/master:./my_file``` where refs path is your proper remote path retrieved by: ```git show-ref``` – kenorb Jun 21 '13 at 12:01
  • 1
    Kind request for all downvoters - please explain and clarify what's not OK - we are here to learn and share :) – Mars Robertson Jul 26 '13 at 10:05
  • 10
    @MichalStefanow: Mike Weller has it; specifically, this doesn't work on a remote repository. You need a local clone at the very least, even if you then have remotes set up on it. – Rob Howard Sep 26 '13 at 14:08
7

A nuanced variant of some of the answers here that answers the OP's question:

git archive --remote=git@archive-accepting-git-server.com:foo/bar.git \
  HEAD path/to/file.txt | tar -xO path/to/file.txt > file.txt
Willem van Ketwich
6

I use this

$ cat ~/.wgetrc
check_certificate = off

$ wget https://raw.github.com/jquery/jquery/master/grunt.js
HTTP request sent, awaiting response... 200 OK
Length: 11339 (11K) [text/plain]
Saving to: `grunt.js'
Zombo
  • Works for me even without wgetrc tweaks: `wget https://raw.github.com/bk322/bk_automates/master/bkubuntu/bkubuntu.bash` – Adobe Sep 03 '12 at 07:54
  • 1
    My message is more helpful: `ERROR: Certificate verification error for raw.github.com: unable to get local issuer certificate.` **`To connect to raw.github.com insecurely, use '--no-check-certificate'.`** – Kos Jan 18 '13 at 13:42
  • 4
    This works for public repositories only. For private repositories you need authentication. – rikas Jan 05 '15 at 18:27
  • Mac didn't have wget, so I used curl but I had to use `curl -H 'Cache-Control: no-cache, no-store' https://raw.githubusercontent.com/org/repo/master/file > outfile` otherwise it does not download if the file has already been downloaded – Arundale Ramanathan Jun 05 '22 at 13:15
5

It seems to me the easiest way is to use the following (quote the URL so the shell does not interpret the ?):

wget 'https://github.com/name/folder/file.zip?raw=true'
Richard Lalancette
ihsinme
  • 2
    Thank you, simple indeed. To get rid of '?raw=true' at the end of saved file one can use: ```-O your-file-name``` at the end of the command above. – timanix Feb 23 '22 at 08:21
4

If your repository host supports tokens (for example GitLab), generate a token for your user, then navigate to the file you want to download and click on the RAW output to get the URL. To download the file, use:

curl --silent --request GET --header 'PRIVATE-TOKEN: replace_with_your_token' \
'http://git.example.com/foo/bar.sql' --output /tmp/bar.sql
panticz
4

This is specific to git repos hosted on GitHub.

Try the 'api' command of GitHub's command-line app, gh, to make an authenticated call to GitHub's 'get repository contents' endpoint.

The basic command is:

$ gh api /repos/{owner}/{repo}/contents/<path_to_the_file>

As an added bonus, when you do this from inside a directory that contains a clone of the repo you're trying to get the file from, the {owner} and {repo} part will be automatically filled in.

https://docs.github.com/en/rest/reference/repos#get-repository-content

The response will be a JSON object. If the <path_to_the_file> indeed points to a file, the JSON will include a 'size', 'name', several url fields to access the file, as well as a 'content' field, which is a base64 encoded version of the file contents.

To get the file contents, you can curl the value of the "download_url", or just decode the 'content' field. You can do that by piping the output into the base64 command, like this:

$ gh api /repos/{owner}/{repo}/contents/<path-to-the-file> --jq '.content' | base64 -d
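
The 'content' value usually arrives as base64 text wrapped over several lines. As a hedged sketch, a small filter can strip those line breaks before decoding, which helps with base64 implementations that are strict about whitespace. The function name `decode_content` is made up, and the sample payload below is hand-written, not real API output.

```shell
#!/bin/sh
# decode_content  (hypothetical helper)
# Reads a base64 payload on stdin, possibly wrapped over several lines,
# and writes the decoded bytes to stdout.
decode_content() {
    # Remove the line breaks first, then decode the single remaining line
    tr -d '\n' | base64 -d
}

# Hand-made sample: "hello world" plus a newline, split across two lines
printf 'aGVsbG8g\nd29ybGQK\n' | decode_content

# With gh (needs network access and authentication):
# gh api /repos/{owner}/{repo}/contents/<path-to-the-file> --jq '.content' | decode_content
```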
dontascii
3

For a single file, just use the wget command.

First, click "Raw" on the file's page to get the URL; otherwise you will download the code embedded in HTML.

The browser will then open a new page whose URL starts with https://raw.githubusercontent.com/...

Then just enter the command in the terminal:

$ wget https://raw.githubusercontent.com/...

After a moment, the file will be saved in your current folder.

malajisi
  • Yep, this also works nicely within Python, or other programming languages with REST functionality, for example for downloading modules from different repositories. – Lars GJ Oct 10 '18 at 13:46
3

If your Git repository is hosted on Azure DevOps (VSTS), you can retrieve a single file with the REST API.

The format of this API looks like this:

 https://dev.azure.com/{organization}/_apis/git/repositories/{repositoryId}/items?path={pathToFile}&api-version=4.1&download=true

For example:

 https://dev.azure.com/{organization}/_apis/git/repositories/278d5cd2-584d-4b63-824a-2ba458937249/items?scopePath=/MyWebSite/MyWebSite/Views/Home/_Home.cshtml&download=true&api-version=4.1
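
As an illustration only, the URL can be assembled in a shell function. The name `azure_item_url` is made up, the query parameters are the ones shown in the format above (joined with `&`), and the path is inserted without URL-encoding, so paths with spaces or special characters would need extra handling. Private repositories also require credentials, which this sketch does not cover.

```shell
#!/bin/sh
# azure_item_url ORGANIZATION REPOSITORY_ID FILE_PATH  (hypothetical helper)
# Prints the Azure DevOps "items" URL for downloading a single file.
azure_item_url() {
    # A URL has one '?' that starts the query string; parameters are joined with '&'
    printf 'https://dev.azure.com/%s/_apis/git/repositories/%s/items?path=%s&download=true&api-version=4.1\n' \
        "$1" "$2" "$3"
}

# Example (placeholder organization name, needs network access):
# curl -fsSL "$(azure_item_url myorg 278d5cd2-584d-4b63-824a-2ba458937249 /MyWebSite/MyWebSite/Views/Home/_Home.cshtml)" -o _Home.cshtml
```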
Shayki Abramczyk
3

The following 2 commands worked for me:

git archive --remote={remote_repo_git_url} -o {tar_out_file}.tar {branch} {file_to_download}

Downloads file_to_download as a tar archive from branch of the remote repository whose URL is remote_repo_git_url and stores it in tar_out_file.tar.

tar -x -f {tar_out_file}.tar

Extracts file_to_download from tar_out_file.tar.

rok
1

I use curl, it works with public repos or those using https basic authentication via a web interface.

curl -L --retry 20 --retry-delay 2 -O https://github.com/ACCOUNT/REPO/raw/master/PATH/TO/FILE/FILE.TXT -u USER:PASSWORD

I've tested it on github and bitbucket, works on both.

JasonS
1

Yisrael Dov's answer is the straightforward one, but it doesn't allow compression. You can use --format=zip, but you can't directly unzip that with a pipe command like you can with tar, so you need to save it as a temporary file. Here's a script:

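#!/bin/bash
# Fetch one or more files (or directories) from a remote repository as a zip archive.

usage() {
    echo "usage: $(basename "$0") <remote-repo> <file> ..." >&2
    exit 1
}

[ "$#" -lt 2 ] && usage

REPO=$1
shift

# mktemp creates the bare temp file; the archive is written next to it with a .zip suffix
TMPBASE=$(mktemp)
TMPFILE="$TMPBASE.zip"
# Remove both files on exit, whether or not the download succeeded
trap 'rm -f "$TMPBASE" "$TMPFILE"' EXIT

# -9 selects maximum compression; the zip format is inferred from the .zip suffix
git archive -9 --remote="$REPO" -o "$TMPFILE" HEAD "$@" && unzip "$TMPFILE"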

This works with directories too.

Community
naught101
1

Github Enterprise Solution

HTTPS_DOMAIN=https://git.your-company.com
ORGANISATION=org
REPO_NAME=my-amazing-library
FILE_PATH=path/to/some/file
BRANCH=develop
GITHUB_PERSONAL_ACCESS_TOKEN=<your-access-token>

URL="${HTTPS_DOMAIN}/raw/${ORGANISATION}/${REPO_NAME}/${BRANCH}/${FILE_PATH}"

curl -H "Authorization: token ${GITHUB_PERSONAL_ACCESS_TOKEN}" "${URL}" > "${FILE_PATH}"
Oliver Pearmain
  • Where do we find the `GITHUB_PERSONAL_ACCESS_TOKEN`? – ShadSterling Mar 23 '20 at 18:38
  • 1
    You can create a personal access token by going to https:///settings/tokens and hitting "Generate new token" button. – Oliver Pearmain Mar 24 '20 at 19:51
  • Hmm, we have automations that are given a username and password, which are used to authenticate to multiple systems that use the same SSO, so I was hoping for a way to automate generating a token given a username & password. – ShadSterling Mar 25 '20 at 13:36
0

If you want to get a file at a specific commit hash from a remote repository: I tried git-archive and it didn't work.

You would have to use git clone and, once the repository is cloned, run git-archive against the local clone.

I posted a question about how to do this more simply: "git archive from a specific hash from remote".

Community
0

For Bitbucket, directly from the browser (I used Safari...), right-click on "View Raw" and choose "Download Linked File".


ingconti
0

If you don't mind cloning the entire repository, this small bash/zsh function will have the end result of copying a single file into your current directory (by cloning the repo into a temp directory and removing it afterwards).

Pro: You only get the file you want

Con: You still have to wait for the whole repo to clone

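git-single-file () {
        if [ $# -lt 2 ]
        then
                echo "Usage: git-single-file <repo url> <file path>" >&2
                return 1
        fi
        # Clone into a throwaway directory, copy the one file out, then clean up
        local temp_dir
        temp_dir=$(mktemp -d) || return 1
        git clone "$1" "$temp_dir" && cp "$temp_dir/$2" .
        local rc=$?
        rm -rf "$temp_dir"
        return $rc
}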
Christopher Shroba
0

If your goal is just to download the file there's a hassle-free application called gget:

gget github.com/gohugoio/hugo 'hugo_extended_*_Linux-ARM.deb'

The above example would download a single file from the hugo repository.

https://github.com/dpb587/gget

Granitosaurus
-1

Related to @Steven Penny's answer, I also use wget. In addition, I use -O to choose which file the output is written to.

If you are using GitLab, another possibility for the URL is:

wget "https://git.labs.your-server/your-repo/raw/master/<path-to-file>" -O <output-file>

Unless you have the certificate, or you access the GitLab installation from a trusted server, you need --no-check-certificate, as @Kos said. I prefer that to modifying .wgetrc, but it depends on your needs.

If it is a big file, you might consider using the -c option with wget, so that you can continue downloading the file from where you left off if the previous attempt failed in the middle.