
I have a zip file containing data in a private GitHub repo. I am writing some R code and want to download the data into R. When I run the following code, I get the error "HTTP status was '404 Not Found'". This doesn't happen when the repo is public, though.

How can I fix the code so that it can download the file? I believe I need an access token, but I don't know how to get one or how to add it to the code. If someone could help, I would appreciate it a lot. Thanks.

# We give the url a name
url <- "https://github.com/aquijanoruiz/disability_benefits_EC/raw/master/ENSANUT_datasets/BDD_ENSANUT_2018_STATA_.zip"
# We create a temporary directory
td <- tempdir()
# We create the placeholder file
tf <- tempfile(tmpdir=td, fileext = ".zip")
# We download the data into the placeholder file
download.file(url,tf)

1 Answer


According to the R 3.6.2 documentation for download.file, you would need at least to add a header containing Authorization: token xxx in order to be properly authenticated (and thus have the right to read the repository content).
Use a PAT (Personal Access Token) as the token value.
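
As a minimal sketch of the header approach described above: the helper names `github_auth_headers` and `download_with_token` are hypothetical, and reading the token from a `GITHUB_PAT` environment variable is an assumption rather than something the question specifies. It relies on the `headers` argument of `download.file`, which takes a named character vector.

```r
# Build the Authorization header for a GitHub personal access token.
# (Hypothetical helper, not part of any package.)
github_auth_headers <- function(token) {
  stopifnot(is.character(token), length(token) == 1L, nzchar(token))
  c(Authorization = paste("token", token))
}

# Download `url` into `destfile`, sending the token in the request headers.
# mode = "wb" writes the zip file in binary mode, which matters on Windows.
download_with_token <- function(url, destfile, token) {
  download.file(url, destfile, mode = "wb",
                headers = github_auth_headers(token))
  invisible(destfile)
}

# Usage (not run here; assumes the token is stored in GITHUB_PAT):
# tf <- tempfile(fileext = ".zip")
# download_with_token(url, tf, Sys.getenv("GITHUB_PAT"))
```

Keeping the token in an environment variable rather than in the script avoids pasting it into code that might later be shared.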

Adding a media type is also recommended, as seen in "How can I download a single raw file from a private GitHub repo using the command line?", but probably not needed in your case, since you are using the /raw/ URL directly.
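
If the /raw/ URL still fails, the media type variant could look roughly like the sketch below. It targets the GitHub contents API instead of the /raw/ URL and asks for the raw file through an Accept header. The helper names are hypothetical, and the `GITHUB_PAT` environment variable and the `master` default branch are assumptions; the path is used as-is, so a path with spaces or special characters would need URL encoding first.

```r
# Build the contents API URL for a file in a repository.
# (Hypothetical helper; `ref` is the branch, tag, or commit to read from.)
github_contents_url <- function(owner, repo, path, ref = "master") {
  sprintf("https://api.github.com/repos/%s/%s/contents/%s?ref=%s",
          owner, repo, path, ref)
}

# Headers: the token for authentication, plus the raw media type so the
# API returns the file bytes instead of a JSON description of the file.
github_raw_headers <- function(token) {
  c(Authorization = paste("token", token),
    Accept = "application/vnd.github.v3.raw")
}

# Usage (not run here; assumes the token is stored in GITHUB_PAT):
# api_url <- github_contents_url(
#   "aquijanoruiz", "disability_benefits_EC",
#   "ENSANUT_datasets/BDD_ENSANUT_2018_STATA_.zip"
# )
# tf <- tempfile(fileext = ".zip")
# download.file(api_url, tf, mode = "wb",
#               headers = github_raw_headers(Sys.getenv("GITHUB_PAT")))
```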

I would use the download function from the remotes package, which wraps download.file and makes it easy to pass an auth_token:

remotes:::download(tf, url, auth_token = "<yourToken>")

The OP Alonso Quijano, however, had to resort to a workaround (from the comments):

I uploaded the file to my Google Drive, where a token is not needed, just the URL.


As noted by Kai Aragaki in the comments:

The download() function from remotes isn't exported, so you must access it with remotes:::download() (note the three colons instead of two)
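
To illustrate the difference between exported and internal objects, here is a small sketch. The helper names `is_exported` and `get_from_package` are hypothetical; the checks use only base R, and the remotes call is left as a commented usage example because it needs that package and a token (assumed to be in `GITHUB_PAT`).

```r
# TRUE if `name` is part of the exported (public) interface of package `pkg`.
is_exported <- function(pkg, name) {
  name %in% getNamespaceExports(pkg)
}

# Fetch an object from a package namespace whether or not it is exported.
# For an internal object this does the same job as pkg:::name.
get_from_package <- function(pkg, name) {
  get(name, envir = asNamespace(pkg), inherits = FALSE)
}

# Usage (not run here):
# if (requireNamespace("remotes", quietly = TRUE)) {
#   is_exported("remotes", "download")   # expected FALSE per the comment above
#   dl <- get_from_package("remotes", "download")
#   dl(tf, url, auth_token = Sys.getenv("GITHUB_PAT"))
# }
```

Since internal functions can change without notice between package versions, this is best treated as a stopgap.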

VonC
  • Thanks, could you please tell me which scopes or permissions I should select just to allow downloading the document? – Alonso Quijano Nov 23 '21 at 15:47
  • @AlonsoQuijano A token with scope repo (https://docs.github.com/en/developers/apps/building-oauth-apps/scopes-for-oauth-apps) should be enough, assuming you have the right to access that private repository (either because you are the owner or because you are a registered collaborator on it). – VonC Nov 23 '21 at 15:57
  • Thanks again. I am trying to use the download function as you suggest. I loaded the `remotes` package, but it doesn't seem to find the function. I used this code: library(remotes) download(path = tf, url = url, auth_token = "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") I also tried adding a header to the download.file function: download.file(tf, url, headers = "Authorization: token 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"). I have left the token in for testing. I can remove it afterward. – Alonso Quijano Nov 23 '21 at 16:38
  • @AlonsoQuijano it should be `download.file(path = ...)` – VonC Nov 23 '21 at 16:57
  • Hi, `download.file` does not seem to have a `path` argument. – Alonso Quijano Nov 23 '21 at 17:18
  • @AlonsoQuijano I cannot test it at the moment, so all I can see is https://rdrr.io/cran/remotes/man/download.html, which seems to list path as an argument. – VonC Nov 23 '21 at 19:00
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/239505/discussion-between-alonso-quijano-and-vonc). – Alonso Quijano Nov 23 '21 at 19:05
  • For `remotes::download` I now get an error: `Error: 'download' is not an exported object from 'namespace:remotes'` – Mark Neal Apr 06 '22 at 21:51
  • @MarkNeal The `download()` function from `remotes` isn't exported, so you must access it with `remotes:::download()` (note the three colons instead of two) – Kai Aragaki May 08 '22 at 18:02
  • @KaiAragaki interesting, this is my first exposure to the triple colon. Documentation had this to say: It is typically a design mistake to use ::: in your code since the corresponding object has probably been kept internal for a good reason. – Mark Neal May 08 '22 at 21:10