Import csv from Kaggle url into a pandas DataFrame

Question

I want to import a public dataset from Kaggle (https://www.kaggle.com/unsdsn/world-happiness?select=2017.csv) into a local Jupyter notebook. I don't want to use any credentials in the process.

I have seen various solutions, including pd.read_html, pd.read_csv, and pd.read_table (pd = pandas). I also found solutions that require a login.

The first set of solutions is the one I am interested in, but I see that they work on other websites because those sites link to the raw data. I have been clicking everywhere in the Kaggle interface but cannot find a direct URL to the raw data.

Bottom line: Is it possible to use, say, pd.read_csv to get data directly from the website into a local notebook? If so, how?

Show us what you tried and explain how it failed to meet your needs. — Paul H, Sep 16 '21 at 23:22
It is usually possible to use `import pandas as pd; df = pd.read_csv(url)` directly. — Felipe Whitaker, Sep 17 '21 at 00:02
With that you get a table with the HTML headers from the page. The data is not even in the output. It works if you have the raw data page, which I can't find for Kaggle datasets... I saw that command being used, and working, with a GitHub URL pointing directly at a dataset. — Sapiens, Sep 22 '21 at 19:21
Does this answer your question? [Import Kaggle csv from download url to pandas DataFrame](https://stackoverflow.com/questions/43516982/import-kaggle-csv-from-download-url-to-pandas-dataframe) — Aidan Feldman, Feb 27 '23 at 21:19

Rob Raymond · Answer 1 · 2021-09-17T08:15:13.120

You can automate kaggle.cli. First, follow the instructions at https://github.com/Kaggle/kaggle-api to download and save kaggle.json for authentication.
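
Before running the download it can help to confirm that credentials are actually in place. The following is a minimal sketch, not part of the Kaggle package: the helper name `kaggle_credentials_found` is hypothetical, and it assumes the locations described in the kaggle-api README (a `kaggle.json` file in `~/.kaggle` or in the directory named by `KAGGLE_CONFIG_DIR`, or the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables). Newer releases may look in other places as well.

```python
import os
from pathlib import Path


def kaggle_credentials_found(config_dir=None, environ=None):
    """Hypothetical helper: report whether Kaggle credentials appear to be configured.

    Assumes the locations described in the kaggle-api README; it does not
    validate the credentials themselves.
    """
    environ = os.environ if environ is None else environ

    # Option 1: username and key supplied through environment variables.
    if environ.get("KAGGLE_USERNAME") and environ.get("KAGGLE_KEY"):
        return True

    # Option 2: a kaggle.json file in the configuration directory.
    if config_dir is None:
        config_dir = environ.get("KAGGLE_CONFIG_DIR") or Path.home() / ".kaggle"
    return (Path(config_dir) / "kaggle.json").is_file()


if not kaggle_credentials_found():
    print("No Kaggle credentials found; the download step will likely fail.")
```

The check only looks for the presence of the file or variables, so a stale or invalid key would still pass it.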

import kaggle.cli
import sys
import pandas as pd
from zipfile import ZipFile

# download data set
# https://www.kaggle.com/unsdsn/world-happiness?select=2017.csv
dataset = "unsdsn/world-happiness"
sys.argv = [sys.argv[0]] + f"datasets download {dataset}".split(" ")
kaggle.cli.main()

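# the CLI saves the archive as <dataset name>.zip in the working directory
zfile = ZipFile(f"{dataset.split('/')[1]}.zip")

# read every file in the archive into a DataFrame, keyed by file name
dfs = {f.filename: pd.read_csv(zfile.open(f)) for f in zfile.infolist()}

dfs["2017.csv"]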
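
The last step, reading each archive member into a DataFrame, can be tried without a Kaggle account. Here is a minimal sketch of that step on its own: the helper name `read_csvs_from_zip` is hypothetical, the archive is built in memory, and the data in it is made up purely so the example runs offline. Unlike the one-liner above, it skips members that are not CSV files and closes the archive when done.

```python
import io
from zipfile import ZipFile

import pandas as pd


def read_csvs_from_zip(zip_source):
    """Return {member name: DataFrame} for every .csv file in a zip archive.

    zip_source may be a path or a binary file-like object.
    """
    frames = {}
    with ZipFile(zip_source) as zfile:
        for info in zfile.infolist():
            # Skip directories and anything that is not a CSV file.
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue
            with zfile.open(info) as member:
                frames[info.filename] = pd.read_csv(member)
    return frames


# Build a tiny in-memory archive with made-up data (not the Kaggle dataset).
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("2017.csv", "Country,Score\nA,7.5\nB,7.4\n")
    archive.writestr("README.txt", "not a csv")
buffer.seek(0)

dfs = read_csvs_from_zip(buffer)
print(list(dfs))
print(dfs["2017.csv"])
```

With a real download, passing the path of the saved zip file instead of `buffer` should work the same way.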
