Download history stock prices automatically from yahoo finance in python

Question

Is there a way to automatically download historical prices of stocks from yahoo finance or google finance (csv format)? Preferably in Python.

check out http://scrape-google-finance.compunect.com/ it's a rather new open source PHP scraper for Google finance. It's free to use/modify and you can download all stock prices and all companies from Google. Should not be too difficult to learn from it and write the same in python. — John, Jun 22 '14 at 19:14

Def_Os · Answer 1 · 2016-09-15T02:33:14.270

When you're going to work with such time series in Python, pandas is indispensable. And here's the good news: it comes with a historical data downloader for Yahoo: pandas.io.data.DataReader.

from pandas.io.data import DataReader
from datetime import datetime

ibm = DataReader('IBM',  'yahoo', datetime(2000, 1, 1), datetime(2012, 1, 1))
print(ibm['Adj Close'])

Here's an example from the pandas documentation.

Update for pandas >= 0.19:

The pandas.io.data module has been removed from pandas>=0.19 onwards. Instead, you should use the separate pandas-datareader package. Install with:

pip install pandas-datareader

And then you can do this in Python:

import pandas_datareader as pdr
from datetime import datetime

ibm = pdr.get_data_yahoo(symbols='IBM', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1))
print(ibm['Adj Close'])

Downloading from Google Finance is also supported.

There's more in the documentation of pandas-datareader.

When I try it, the imports work fine but when I call the 'goog' line I receive an error: "IOError: after 3 tries, Yahoo! did not return a 200 for url 'http://ichart.finance.yahoo.com/table.csv?s=GOOG&a=0&b=1&c=2000&d=0&e=1&f=2012&g=d&ignore=.csv'" How could this be fixed? — Cleb, Aug 25 '15 at 15:01
@Cleb Seems to be because `GOOG` is not accepted by the API (don't understand why, after move to Alphabet GOOG ticker was kept). Works fine for `GOOGL` and various other symbols. Example adjusted just in case. — Def_Os, Oct 28 '15 at 02:10

score 42 · Accepted Answer · answered Sep 14 '12 at 23:29

Short answer: Yes. Use Python's urllib to pull the historical data pages for the stocks you want. Go with Yahoo! Finance; Google is both less reliable, has less data coverage, and is more restrictive in how you can use it once you have it. Also, I believe Google specifically prohibits you from scraping the data in their ToS.

Longer answer: This is the script I use to pull all the historical data on a particular company. It pulls the historical data page for a particular ticker symbol, then saves it to a csv file named by that symbol. You'll have to provide your own list of ticker symbols that you want to pull.

import urllib

base_url = "http://ichart.finance.yahoo.com/table.csv?s="
def make_url(ticker_symbol):
    return base_url + ticker_symbol

output_path = "C:/path/to/output/directory"
def make_filename(ticker_symbol, directory="S&P"):
    return output_path + "/" + directory + "/" + ticker_symbol + ".csv"

def pull_historical_data(ticker_symbol, directory="S&P"):
    try:
        urllib.urlretrieve(make_url(ticker_symbol), make_filename(ticker_symbol, directory))
    except urllib.ContentTooShortError as e:
        outfile = open(make_filename(ticker_symbol, directory), "w")
        outfile.write(e.content)
        outfile.close()

I get the following error: `AttributeError: module 'urllib' has no attribute 'ContentTooShortError'` — adrCoder, Oct 29 '19 at 10:26
This answer was written 7 years ago, for Python 2.7. In the intervening timespan, Python 3 has modified the `urllib` module to be broken into submodules. The exception class you're looking for can be found here: https://docs.python.org/3/library/urllib.error.html#urllib.error.ContentTooShortError — Joe C., Oct 29 '19 at 20:23

score 14 · Answer 3 · edited Jun 09 '17 at 07:59

Extending @Def_Os's answer with an actual demo...

As @Def_Os has already said - using Pandas Datareader makes this task a real fun

In [12]: from pandas_datareader import data

pulling all available historical data for AAPL starting from 1980-01-01

#In [13]: aapl = data.DataReader('AAPL', 'yahoo', '1980-01-01')

# yahoo api is inconsistent for getting historical data, please use google instead.
In [13]: aapl = data.DataReader('AAPL', 'google', '1980-01-01')

first 5 rows

In [14]: aapl.head()
Out[14]:
                 Open       High     Low   Close     Volume  Adj Close
Date
1980-12-12  28.750000  28.875000  28.750  28.750  117258400   0.431358
1980-12-15  27.375001  27.375001  27.250  27.250   43971200   0.408852
1980-12-16  25.375000  25.375000  25.250  25.250   26432000   0.378845
1980-12-17  25.875000  25.999999  25.875  25.875   21610400   0.388222
1980-12-18  26.625000  26.750000  26.625  26.625   18362400   0.399475

last 5 rows

In [15]: aapl.tail()
Out[15]:
                 Open       High        Low      Close    Volume  Adj Close
Date
2016-06-07  99.250000  99.870003  98.959999  99.029999  22366400  99.029999
2016-06-08  99.019997  99.559998  98.680000  98.940002  20812700  98.940002
2016-06-09  98.500000  99.989998  98.459999  99.650002  26419600  99.650002
2016-06-10  98.529999  99.349998  98.480003  98.830002  31462100  98.830002
2016-06-13  98.690002  99.120003  97.099998  97.339996  37612900  97.339996

save all data as CSV file

In [16]: aapl.to_csv('d:/temp/aapl_data.csv')

d:/temp/aapl_data.csv - 5 first rows

Date,Open,High,Low,Close,Volume,Adj Close
1980-12-12,28.75,28.875,28.75,28.75,117258400,0.431358
1980-12-15,27.375001,27.375001,27.25,27.25,43971200,0.408852
1980-12-16,25.375,25.375,25.25,25.25,26432000,0.378845
1980-12-17,25.875,25.999999,25.875,25.875,21610400,0.38822199999999996
1980-12-18,26.625,26.75,26.625,26.625,18362400,0.399475
...

Naufal · Answer 4 · 2018-12-20T14:00:18.230

8

There is already a library in Python called yahoo_finance so you'll need to download the library first using the following command line:

sudo pip install yahoo_finance

Then once you've installed the yahoo_finance library, here's a sample code that will download the data you need from Yahoo Finance:

#!/usr/bin/python
import yahoo_finance
import pandas as pd

symbol = yahoo_finance.Share("GOOG")
google_data = symbol.get_historical("1999-01-01", "2016-06-30")
google_df = pd.DataFrame(google_data)

# Output data into CSV
google_df.to_csv("/home/username/google_stock_data.csv")

This should do it. Let me know if it works.

UPDATE: The yahoo_finance library is no longer supported.

edited Dec 20 '18 at 14:00

answered Jul 26 '16 at 15:53

Naufal

1,203
14
12

Just curious -- what benefit does yahoo-finance provide over pandas_datareader (or vice versa)? – Mathematician Jan 31 '18 at 04:35
2

At the time that was the only library I could find in relation to pulling Yahoo Finance Stock Prices. However, it doesn't seem to be working anymore. – Naufal Jan 31 '18 at 10:07
4

Affirmed - 21/02/2018 - `yahoo_finance` no longer works. – 3kstc Feb 20 '18 at 23:40
Thanks for this 3kstc – Naufal Mar 13 '18 at 11:17
1

Yahoo finance API is no longer supported. – MyopicVisage Sep 22 '18 at 18:44

atreadw · Answer 5 · 2021-08-28T04:57:40.243

4

You can check out the yahoo_fin package. It was initially created after Yahoo Finance changed their API (documentation is here: http://theautomatic.net/yahoo_fin-documentation).

from yahoo_fin import stock_info as si

aapl_data = si.get_data("aapl")

nflx_data = si.get_data("nflx")

aapl_data.head()

nflx_data.head()

aapl_data.to_csv("aapl_data.csv")

nflx_data.to_csv("nflx_data.csv")

edited Aug 28 '21 at 04:57

answered Apr 13 '19 at 02:26

atreadw

289
2
6

Thank you for this reference, this was the only ready solution that work for me. All other packages mentioned here seem to be outdated and don't work, probably because of the API changes in Yahoo Finance.\ – Tim Mironov Jul 02 '19 at 11:53
typo: aapl.to_csv("aapl_data.csv")` should be `aapl_data.to_csv("aapl_data.csv")` – ShpielMeister Aug 27 '21 at 00:54

score 2 · Answer 6 · answered Oct 15 '20 at 14:22

2

It's trivial when you know how:

import yfinance as yf
df = yf.download('CVS', '2015-01-01')
df.to_csv('cvs-health-corp.csv')

If you wish to plot it:

import finplot as fplt
fplt.candlestick_ochl(df[['Open','Close','High','Low']])
fplt.show()

answered Oct 15 '20 at 14:22

Jonas Byström

25,316
23
100
147

Download history stock prices automatically from yahoo finance in python

6 Answers6

Linked