4

I am looking for the easiest way to download Kaggle competition data (train and test) onto a virtual machine using bash, so that I can train on it there without uploading it to git.

mannuscript

4 Answers

3

Fast-forward three years and you can now use Kaggle's API through its CLI, for example:

kaggle competitions download favorita-grocery-sales-forecasting
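
The command above needs the CLI installed and an API token in place before it will work. The following is a minimal sketch of that setup, assuming the CLI was installed with `pip install kaggle` and that a `kaggle.json` token was downloaded from your Kaggle account page. The function names `setup_kaggle_token` and `download_competition` are hypothetical helpers, not part of the CLI.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are made up for illustration.
# Assumes the kaggle CLI is already installed (e.g. `pip install kaggle`).

# Put the API token where the CLI looks for it and restrict its permissions.
# $1 = path to the kaggle.json token downloaded from your Kaggle account page
setup_kaggle_token() {
  local token_file="$1"
  mkdir -p "$HOME/.kaggle"
  cp "$token_file" "$HOME/.kaggle/kaggle.json"
  chmod 600 "$HOME/.kaggle/kaggle.json"
}

# Download the files of one competition into a target directory.
# $1 = competition slug, $2 = target directory (created if missing)
download_competition() {
  local competition="$1" target_dir="$2"
  mkdir -p "$target_dir"
  kaggle competitions download -c "$competition" -p "$target_dir"
}
```

A typical session would be `setup_kaggle_token ./kaggle.json` followed by `download_competition favorita-grocery-sales-forecasting ./data`. The download is likely to be refused until you have accepted the competition's rules on the website.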

gosuto
2

First, export your cookies for the Kaggle site to a text file named cookies.txt. There is a Chrome extension that can do this for you.

Now transfer the file to the EC2 instance with the following command:

scp -i /path/my-key-pair.pem /path/cookies.txt user-name@ec2-xxx-xx-xxx-x.compute-1.amazonaws.com:~

Accept the competition's rules and copy the URLs of the datasets you want to download from kaggle.com. For example, the URL to download the sample_submission.csv file of the Intel & MobileODT Cervical Cancer Screening competition is: https://kaggle.com/c/intel-mobileodt-cervical-cancer-screening/download/sample_submission.csv.zip

Now, from the terminal, use the following command to download the dataset onto the instance:

wget -x --load-cookies cookies.txt https://kaggle.com/c/intel-mobileodt-cervical-cancer-screening/download/sample_submission.csv.zip
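
If you need several files, the same wget call can be repeated over a list of URLs. This is a rough sketch, assuming a text file with one download URL per line; the function name `download_url_list` and the file name `urls.txt` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: download_url_list is a made-up helper name.

# Run the cookie-authenticated wget call once per URL in a list.
# $1 = cookies file exported from the browser
# $2 = text file with one download URL per line (blank lines are skipped)
download_url_list() {
  local cookies="$1" url_list="$2" url
  if [ ! -r "$cookies" ]; then
    echo "cannot read cookie file: $cookies" >&2
    return 1
  fi
  # The extra test keeps a final line that lacks a trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    [ -n "$url" ] || continue
    wget -x --load-cookies "$cookies" "$url"
  done < "$url_list"
}
```

For example: `download_url_list cookies.txt urls.txt`. A failed download does not stop the loop, so check wget's output for each file.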
Ernest S Kirubakaran
1

Install the CurlWget Chrome extension.

Start downloading your Kaggle dataset in the browser. CurlWget will give you the full wget command. Paste this command into the terminal with sudo.

Job is done.

Ashish
0
  1. Install the cookies.txt extension in Chrome and enable it.
  2. Log in to Kaggle.
  3. Go to the page of the challenge whose data you want.
  4. Click the cookies.txt extension icon at the top right; it downloads the current page's cookies to a cookies.txt file.
  5. Transfer the file to the remote machine using scp or another method.
  6. Copy the data link shown on the Kaggle page (right-click and copy the link address).
  7. Run wget -x --load-cookies cookies.txt <datalink>
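
A common failure in the steps above is exporting the cookies while logged out, or from the wrong site. The sketch below checks the file before calling wget. It assumes the extension writes the Netscape cookies.txt layout (tab-separated fields with the domain first); the function names `has_kaggle_cookie` and `fetch_kaggle_file` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are made up for illustration.

# Succeed if the cookie file holds at least one cookie for kaggle.com.
# Assumes the Netscape cookies.txt layout: tab-separated, domain in field 1.
has_kaggle_cookie() {
  awk -F'\t' '$1 ~ /kaggle\.com$/ { found = 1 } END { exit !found }' "$1"
}

# Download one data link, but only after the cookie check passes.
# $1 = cookies file, $2 = data link copied from the Kaggle page
fetch_kaggle_file() {
  local cookies="$1" datalink="$2"
  if ! has_kaggle_cookie "$cookies"; then
    echo "no kaggle.com cookie in $cookies; export it again while logged in" >&2
    return 1
  fi
  wget -x --load-cookies "$cookies" "$datalink"
}
```

On the remote machine this would be called as `fetch_kaggle_file cookies.txt <datalink>`. The check only confirms that a kaggle.com entry is present; it cannot tell whether the session behind it is still valid.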