Read in csv file from a website in Python3

Question

I am trying to read a CSV file directly from a website with pandas. Here is my Python 3 code:

import pandas as pd
url = "https://www.w3resource.com/python-exercises/pandas/plotting/alphabet_stock_data.csv"
data = pd.read_csv(url)

But I got the following error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
Input In [6], in <cell line: 3>()
      1 import pandas as pd
      2 url = "https://www.w3resource.com/python-exercises/pandas/plotting/alphabet_stock_data.csv"
----> 3 data = pd.read_csv(url)

File ~/opt/anaconda3/lib/python3.9/site-packages/pandas/util/_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    305 if len(args) > num_allow_args:
    306     warnings.warn(
    307         msg.format(arguments=arguments),
    308         FutureWarning,
    309         stacklevel=stacklevel,
    310     )
--> 311 return func(*args, **kwargs)

Any clue? Many thanks.

See if this can help https://stackoverflow.com/questions/32400867/pandas-read-csv-from-url — Aman ZeeK Verma, Sep 01 '22 at 02:22
The stack trace ends with `HTTPError: HTTP Error 403: Forbidden`. I tried adding a header to read_csv, but I still get the same error. — Adrien Pacifico, Sep 01 '22 at 07:38

score 1 · Answer 1 · answered Sep 01 '22 at 04:49

I like to fetch the file with requests and then pass the response text to pandas.

from io import StringIO

import pandas as pd
import requests


def get_data() -> pd.DataFrame:
    url = "https://www.w3resource.com/python-exercises/pandas/plotting/alphabet_stock_data.csv"

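    # The session closes its connection when the with-block exits.
    with requests.Session() as session:
        response = session.get(url)
    # Raise requests.HTTPError on a 4xx/5xx status instead of parsing an error page as CSV.
    response.raise_for_status()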

    return pd.read_csv(StringIO(response.text), sep=",")


print(get_data())

A great alternative. Thank you. — Sophia, Sep 05 '22 at 19:57

score 1 · Accepted Answer · answered Sep 01 '22 at 07:45

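The server apparently answers 403 Forbidden to the default user agent that pandas sends. Pass a browser-like User-Agent through the storage_options argument; for HTTP(S) URLs, pandas forwards these key-value pairs as request headers: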

import pandas as pd

url = "https://www.w3resource.com/python-exercises/pandas/plotting/alphabet_stock_data.csv"
storage_options = {'User-Agent': 'Mozilla/5.0'}
df = pd.read_csv(url, storage_options=storage_options)

Taken from: https://stackoverflow.com/a/68816828/5304366
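
If your pandas version does not accept headers through storage_options, the same idea can be sketched with the standard library: build the request yourself, attach the User-Agent header, and hand the downloaded text to pandas. The helper names build_request, fetch_text and read_csv_with_user_agent are hypothetical, and the "Mozilla/5.0" value is only an assumption about what this particular server accepts.

```python
import io
import urllib.request


def build_request(url: str, user_agent: str = "Mozilla/5.0") -> urllib.request.Request:
    """Return a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_text(url: str, user_agent: str = "Mozilla/5.0", timeout: float = 30.0) -> str:
    """Download the URL and decode the body; raises urllib.error.HTTPError on 4xx/5xx."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def read_csv_with_user_agent(url: str, user_agent: str = "Mozilla/5.0"):
    """Fetch the CSV text with a custom User-Agent and parse it into a DataFrame."""
    # Imported here so the two helpers above stay usable without pandas installed.
    import pandas as pd

    return pd.read_csv(io.StringIO(fetch_text(url, user_agent)))
```

Calling read_csv_with_user_agent(url) with the url from the question should then behave like the storage_options version, provided the server only checks the User-Agent header.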
