Python 3.6 Downloading .csv files from finance.yahoo.com using requests module

Question

I was trying to download a .csv file from this url for the history of a stock. Here's my code:

    import requests
    r = requests.get("https://query1.finance.yahoo.com/v7/finance/download/CHOLAFIN.BO?period1=1514562437&period2=1517240837&interval=1d&events=history&crumb=JaCfCutLNr7")
    file = open(r"history_of_stock.csv", 'w')
    file.write(r.text)
    file.close()

But when I opened the file history_of_stock.csv, this was what I found:

    {
        "finance": {
            "error": {
                "code": "Unauthorized",
                "description": "Invalid cookie"
            }
        }
    }

I couldn't find anything that could fix my problem. I found this thread in which someone has the same problem except that it is in C#: C# Download price data csv file from https instead of http

score 2 · Answer 1 · edited Nov 13 '18 at 08:20

To complement the earlier answer with a complete, concrete example, I wrote a script that downloads historical stock prices from Yahoo Finance. I tried to keep it as simple as possible. In summary: for many URLs you can call requests.get without worrying about crumbs or cookies, but Yahoo Finance requires both. Once you have the cookies and the crumb, you are good to go. Make sure to set a timeout on the requests.get call.

    import re
    import sys

    import requests

    symbol = sys.argv[-1]      # ticker symbol, taken from the last command-line argument
    start_date = '1442203200'  # start date as a Unix timestamp
    end_date = '1531800000'    # end date as a Unix timestamp

    crumble_link = 'https://finance.yahoo.com/quote/{0}/history?p={0}'
    crumble_regex = r'CrumbStore":{"crumb":"(.*?)"}'
    quote_link = ('https://query1.finance.yahoo.com/v7/finance/download/{}'
                  '?period1={}&period2={}&interval=1d&events=history&crumb={}')

    # Load the history page through a session so that Yahoo's cookies are stored.
    session = requests.Session()
    response = session.get(crumble_link.format(symbol), timeout=5)

    # Get the crumb embedded in the page source.
    match = re.search(crumble_regex, response.text)
    if match is None:
        sys.exit('Could not find a crumb in the page for {}'.format(symbol))
    crumbs = match.group(1)

    # Get the cookies collected by the session.
    cookies = session.cookies.get_dict()

    # Download the CSV, sending the cookies together with the crumb.
    url = quote_link.format(symbol, start_date, end_date, crumbs)
    r = requests.get(url, cookies=cookies, timeout=5)

    filename = '{}.csv'.format(symbol)
    with open(filename, 'w') as f:
        f.write(r.text)
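
The start_date and end_date values in the script are Unix timestamps (seconds since 1970-01-01 UTC) passed as period1 and period2. If you would rather start from calendar dates, something like the following might help. This is only a sketch: the helper name to_unix_timestamp is made up for illustration, the dates are examples, and using midnight UTC as the cut-off is my assumption, not something Yahoo documents.

```python
from datetime import datetime, timezone


def to_unix_timestamp(year, month, day):
    """Return midnight UTC on the given date as a Unix timestamp string."""
    # Hypothetical helper: midnight UTC is an assumed cut-off, not a Yahoo rule.
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return str(int(moment.timestamp()))


# Example dates only; substitute the range you need.
start_date = to_unix_timestamp(2015, 9, 14)
end_date = to_unix_timestamp(2018, 7, 17)
```

The strings it returns can be used in place of the hard-coded start_date and end_date.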

score 1 · Accepted Answer · answered Jan 31 '18 at 16:20

There was a service for exactly this but it was discontinued.

Now you can do what you intend, but first you need to get a cookie. On this post there is an example of how to do it. Basically, you first make a throwaway request just to obtain the cookie, and then, with that cookie in place, you can query whatever else you actually need.
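
If the cookie step is skipped or fails, the download URL answers with a small JSON error document like the one in the question instead of CSV, and that is what ends up in the file. A rough way to catch this before writing anything is sketched below. The function name yahoo_error is hypothetical, and the check assumes the error keeps the same shape as the response shown in the question.

```python
import json


def yahoo_error(body):
    """Return 'code: description' if body looks like a Yahoo error, else None."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None  # not JSON, so presumably the CSV we asked for
    # Walk finance -> error defensively; the shape is assumed from the question.
    finance = payload.get("finance") if isinstance(payload, dict) else None
    error = finance.get("error") if isinstance(finance, dict) else None
    if isinstance(error, dict):
        return "{}: {}".format(error.get("code"), error.get("description"))
    return None


sample = '{"finance": {"error": {"code": "Unauthorized", "description": "Invalid cookie"}}}'
print(yahoo_error(sample))
```

You would call it on r.text and only write the file when it returns None.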

There's also a post about another service which might make your life easier.

There's also a Python module to work around this inconvenience and code to show how to do it without it.

I could not find a way to link to specific posts... :( – Javier, Jan 31 '18 at 16:21
