I am trying to download data from Kaggle into R using the command below. The datasets I am trying to download are located at https://www.kaggle.com/c/elo-merchant-category-recommendation/data.

library(httr)
dataset <- GET("https://www.kaggle.com/api/v1/competitions/data/download/10445/train.csv", 
         authenticate(username, authkey, type = "basic"))

The variable dataset is of type "application/zip" (I checked this with http_type(dataset)). Can someone help me get the csv file out of the zip archive that the link returns? Please let me know if my question is unclear.

Edit: Included library name based on comments.

user13874
  • Where does the function `GET` come from? Secondly, I get "This page isn't working" when I try the link. – Maurits Evers Mar 21 '19 at 02:34
  • The function comes from the package `httr`. It's written by Hadley Wickham. https://www.rdocumentation.org/packages/httr/versions/1.4.0/topics/GET – user13874 Mar 21 '19 at 02:42
  • Always *explicitly* include any non base R library calls; in your case `library(httr)`. – Maurits Evers Mar 21 '19 at 02:43
  • @MauritsEvers: The url in the GET function doesn't work by itself. But it can help pull data based on the authentication. The datasets are located at https://www.kaggle.com/c/elo-merchant-category-recommendation/data. I am struggling to pull a dataset from Kaggle into R directly. If there's a more elegant way to do it, I am all eyes and ears. Neither kaggler package nor some functions I found on Kaggle worked for me – user13874 Mar 21 '19 at 02:47

1 Answer

I found a solution based on the answer posted here. Someone posted the link in the comment but I don't see the comment any more. Thank you Good Samaritan!

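library(httr)

# username and authkey hold your Kaggle credentials
dataset <- httr::GET("https://www.kaggle.com/api/v1/competitions/data/download/10445/train.csv", 
                 httr::authenticate(username, authkey, type = "basic"))

# dataset$url is the URL the request ended up at (after any redirects);
# download the zip archive from there into a temporary file
temp <- tempfile()
download.file(dataset$url, temp, mode = "wb")  # "wb" keeps the binary zip intact on Windows
# Read train.csv directly from inside the archive, then clean up
data <- read.csv(unz(temp, "train.csv"))
unlink(temp)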
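
A possible variation, offered only as a sketch: since the question reports the response type as "application/zip", the body returned by `GET` should already hold the archive, so it may be possible to skip the second download and write those bytes to a temporary file instead. The helper names `read_zipped_csv` and `fetch_kaggle_csv` below are hypothetical (they are not part of httr or base R), and I have not verified this against the Kaggle API.

```r
# Sketch: read one CSV out of a zip archive held in memory as raw bytes.
# `read_zipped_csv` is a hypothetical helper, not part of httr or base R.
read_zipped_csv <- function(zip_bytes, member) {
  stopifnot(is.raw(zip_bytes), is.character(member), length(member) == 1)
  zip_path <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_path), add = TRUE)   # remove the temp file when done
  writeBin(zip_bytes, zip_path)           # binary-safe on every platform
  read.csv(unz(zip_path, member))         # read the named file inside the zip
}

# Hypothetical wrapper around the request from the question.
# Assumes the response body is the zip archive itself.
fetch_kaggle_csv <- function(url, username, authkey, member) {
  resp <- httr::GET(url, httr::authenticate(username, authkey, type = "basic"))
  httr::stop_for_status(resp)             # raise an R error on an HTTP error status
  read_zipped_csv(httr::content(resp, as = "raw"), member)
}
```

With the same URL and credentials as above, the call would be fetch_kaggle_csv(url, username, authkey, "train.csv"). This keeps the whole archive in memory, so for very large files downloading to disk is probably the safer choice.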
user13874
  • It's perfectly acceptable to answer and accept your own answer. So you might want to set the green checkmark next to the answer to close the question. – Maurits Evers Mar 21 '19 at 04:37
  • Stack Overflow makes people wait two days before accepting their own solution as an answer. I will keep tabs on it and set the green checkmark in two days. – user13874 Mar 21 '19 at 04:45
  • If I may ask, how did you go about getting the download link from Kaggle? – Confusion Matrix Oct 28 '22 at 16:07