70

I am trying to read a CSV file from GitHub into R:

latent.growth.data <- read.csv("https://github.com/aronlindberg/latent_growth_classes/blob/master/LGC_data.csv")

However, this gives me:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : unsupported URL scheme

I tried ?read.csv, ?download.file, getURL (which only returned strange HTML), as well as the data import manual, but still cannot understand how to make it work.

What am I doing wrong?

pnuts
histelheim

10 Answers

115

Try this:

library(RCurl)
x <- getURL("https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv")
y <- read.csv(text = x)

You have two problems:

  1. You're not linking to the "raw" text file but to GitHub's display version (visit the URL for https://raw.github.com....csv to see the difference between the raw version and the display version).
  2. https is a problem for R in many cases, so you need a package like RCurl to get around it. In some cases (though not with GitHub) you can simply replace https with http and things work, so you can always try that first, but I find RCurl reliable and not too much extra typing.
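
To illustrate the first point, here is a minimal sketch of turning a display ("blob") URL into a raw one. `github_raw_url` is a hypothetical helper written for this example, not part of RCurl or any other package, and it assumes the usual `github.com/<owner>/<repo>/blob/<branch>/<path>` layout and the `raw.githubusercontent.com` host.

```r
# Hypothetical helper (an assumption for illustration, not a library function):
# rewrite a GitHub "blob" page URL into the matching raw-file URL.
github_raw_url <- function(url) {
  # Capture owner and repo, drop the "/blob/" segment, and swap the host.
  # URLs that do not match the pattern are returned unchanged.
  sub("^https://github\\.com/([^/]+)/([^/]+)/blob/",
      "https://raw.githubusercontent.com/\\1/\\2/",
      url)
}

blob_url <- "https://github.com/aronlindberg/latent_growth_classes/blob/master/LGC_data.csv"
raw_url  <- github_raw_url(blob_url)
print(raw_url)
```

The resulting `raw_url` can then be passed to `getURL()` as in the code above.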
dedmonds
A5C1D2H2I1M1N2O1R2T1
  • How do you resolve `Error in function (type, msg, asError = TRUE) : SSL certificate problem: unable to get local issuer certificate`? – Hack-R Jan 08 '15 at 17:54
  • Can also be written as one line for memory/space purposes: `y <- read.csv(text=getURL("https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv"))` – bjoseph Mar 23 '15 at 14:53
  • I tried this but it did not work. ```x <- getURL("https://github.com/eparker12/nCoV_tracker/blob/master/input_data/coronavirus_today.csv") y <- read.csv(text = x)``` – Ben10 May 14 '20 at 19:00
  • @Ben10, you're not using the raw URL. Can you try with that and see if it works? – A5C1D2H2I1M1N2O1R2T1 May 15 '20 at 02:59
26

From the documentation of url:

Note that ‘https://’ connections are not supported (with some exceptions on Windows).

So the problem is that R does not allow connections to https URLs.

You can use download.file with curl:

download.file("https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv", 
    destfile = "/tmp/test.csv", method = "curl")
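
Note that the call above only saves the file; it still has to be read. A minimal sketch of doing both steps, assuming a `curl` binary is on the PATH when `method = "curl"` is used. `fetch_csv` is a hypothetical helper name, and it writes to a `tempfile()` rather than `/tmp/test.csv` so it does not depend on that directory existing.

```r
# Hypothetical helper (an assumption for illustration): download a CSV to a
# temporary file with download.file(), then read it with read.csv().
fetch_csv <- function(url, method = "curl") {
  dest <- tempfile(fileext = ".csv")
  on.exit(unlink(dest), add = TRUE)  # remove the temporary file afterwards
  # download.file() returns 0 (invisibly) on success
  status <- download.file(url, destfile = dest, method = method, quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.csv(dest)
}

# Usage (needs network access, so it is left commented out here):
# latent.growth.data <- fetch_csv("https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv")
```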
Paul Hiemstra
23

I am using R 3.0.2 and this code does the job.

urlfile<-'https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv'
dsin<-read.csv(urlfile)

and this as well

urlfile<-'https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv'
dsin<-read.csv(url(urlfile))

edit (sessionInfo)

R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250   
[3] LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C                  
[5] LC_TIME=Polish_Poland.1250    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.0.2
Maciej
14

In similar style to akhmed, I thought I would update the answer, since now you can just use Hadley's readr package. Just one thing to note: you'll need the url to be the raw content (see the //raw.git... below). Here's an example:

library(readr)
data <- read_csv("https://raw.githubusercontent.com/RobertMyles/Bayesian-Ideal-Point-IRT-Models/master/Senate_Example.csv")

Voilà!

RobertMyles
8

I realize that the question is very old, but Google still reported it as a top result (at least for me), so I decided to provide an answer for the year 2015.

Folks are now generally migrating to the curl package (including the well-known httr), as described on R-bloggers, which offers the following very simple solution:

library(curl)

x <- read.csv( curl("https://raw.githubusercontent.com/trinker/dummy/master/data/gcircles.csv") )
akhmed
4

This is what I've been helping develop rio for. It's basically a universal data import/export package that supports HTTPS/SSL and infers the file type from its extension, thus allowing you to read basically anything using one import function:

library("rio")

If you grab the "raw" URL for your CSV from GitHub, you can load it in one line with import:

import("https://raw.githubusercontent.com/aronlindberg/latent_growth_classes/master/LGC_data.csv")

The result is a data.frame:

     top100_repository_name   month monthly_increase monthly_begin_at monthly_end_with
1                    Bukkit 2012-03                9              431              440
2                    Bukkit 2012-04               19              438              457
3                    Bukkit 2012-05               19              455              474
4                    Bukkit 2012-06               18              475              493
5                    Bukkit 2012-07               15              492              507
6                    Bukkit 2012-08               50              506              556
...
Thomas
  • I try this and get `get_ext(file) : file has no extension` – Adrian May 28 '15 at 02:16
  • @Adrian There was a small typo in the most recent Github version. Either install the older version from CRAN or reinstall from Github and it should work for you. – Thomas May 28 '15 at 05:47
  • Thanks - problem is fixed. Your solution is the only one that worked for me (Windows 8.1) – Adrian May 28 '15 at 07:05
2

Seems nowadays GitHub wants you to go through their API to fetch content. I used the gh package as follows:

require(gh)

tmp = tempfile()
qurl = 'https://raw.githubusercontent.com/aronlindberg/latent_growth_classes/master/LGC_data.csv'
# download
gh(paste0('GET ', qurl), .destfile = tmp, .overwrite = TRUE)
# read
read.csv(tmp)

The important part is that you provide a personal access token (PAT), either through the gh(.token = ) argument or, as I did, by setting the PAT globally in an ~/.Renviron file [1]. Of course, you first have to create the PAT in your GitHub account.

[1] I guess ~/.Renviron is searched first by all r-lib packages, of which gh is one. The token therein should look like this:

GITHUB_PAT = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

You could also use the usethis package to set up the PAT.
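
As a quick sanity check, something like the following sketch can confirm that the token is visible to the current R session. It only uses base R and assumes the variable is named `GITHUB_PAT` as in the ~/.Renviron example above; it does not verify that the token is actually valid on GitHub.

```r
# Check whether a GitHub personal access token is visible to this R session.
# Assumes the environment variable is called GITHUB_PAT, as in ~/.Renviron above.
pat <- Sys.getenv("GITHUB_PAT", unset = "")

if (nzchar(pat)) {
  # Report only the length so the token itself is never printed.
  message("GITHUB_PAT is set (", nchar(pat), " characters).")
} else {
  message("GITHUB_PAT is not set; add it to ~/.Renviron and restart R.")
}
```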

andschar
0

The curl method might not work on Windows; at least it did not for me.

This is what worked for me on Windows:

download.file("https://raw.githubusercontent.com/aronlindberg/latent_growth_classes/master/LGC_data.csv", 
    destfile = "/tmp/test.csv", method = "wininet")

On Linux:

download.file("https://raw.githubusercontent.com/aronlindberg/latent_growth_classes/master/LGC_data.csv", 
    destfile = "/tmp/test.csv", method = "curl")
RyanFrost
akhil vangala
0

A rather crude way: copy the data in the browser and read it from the clipboard.

x <- read.table(file = "clipboard", sep = "\t", header = TRUE)
Lefty
0

As mentioned in other answers, just use the link to the raw file on GitHub.

For example:

x <- read.csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2018/2018-04-23/week4_australian_salary.csv")
canovasjm
zeejay