3

I get this error message when I try to make a CSV file from the whole S&P 500:

Exception has occurred: pandas_datareader._utils.RemoteDataError

No data fetched for symbol 3M Company using YahooDailyReader

I think something is wrong with this part:

for row in table.findAll('tr') [1:]:
    ticker = row.findAll('td')[0:].text

Can somebody please help me? Thanks in advance.

Full code:

import bs4 as bs
import datetime as dt
import os
import pandas_datareader.data as web
import pickle
import requests

def save_sp500_tickers():
    resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable'})
    tickers = []

    for row in table.findAll('tr') [1:]:
        ticker = row.findAll('td')[0:].text
        tickers.append(ticker)
    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)
    return tickers

# save_sp500_tickers()
def get_data_from_yahoo(reload_sp500=False):
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')

    start = dt.datetime(2010, 1, 1)
    end = dt.datetime.now()
    for ticker in tickers:
    # just in case your connection breaks, we'd like to save our progress!
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = web.DataReader(ticker, 'yahoo', start, end)
            df.reset_index(inplace=True)
            df.set_index("Date", inplace=True)
            df = df.drop("Symbol", axis=1)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))


get_data_from_yahoo()
AS Mackay
Chagen

5 Answers

9

Several parts of that code are out of date. The solution I found required installing fix_yahoo_finance and yfinance using:

pip install yfinance
pip install fix_yahoo_finance

This seemed to work for me; full code below.

import bs4 as bs
import datetime as dt
import os
from pandas_datareader import data as pdr
import pickle
import requests
import fix_yahoo_finance as yf

yf.pdr_override()

def save_sp500_tickers():
    resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable'})
    tickers = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text.replace('.', '-')
        ticker = ticker[:-1]
        tickers.append(ticker)
    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)
    return tickers


# save_sp500_tickers()
def get_data_from_yahoo(reload_sp500=False):
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    start = dt.datetime(2019, 6, 8)
    end = dt.datetime.now()
    for ticker in tickers:
        print(ticker)
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = pdr.get_data_yahoo(ticker, start, end)
            df.reset_index(inplace=True)
            df.set_index("Date", inplace=True)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))


save_sp500_tickers()
get_data_from_yahoo()
Remolten
Luc McCutcheon
  • I use exactly this code, but I get this error: `KeyError: "name='B', domain=None, path=None"` – Milad Jul 17 '22 at 05:37
6

Scraping from Wikipedia writes 'MMM\n' (the ticker plus a trailing newline) to the pickle file.

Add

ticker = ticker[:-1]

to

for row in table.findAll('tr')[1:]: 
    ticker = row.findAll('td')[0].text
    ticker = ticker[:-1]
    tickers.append(ticker)

and re-generate your pickle file.

That should leave the tickers as 'MMM' and not 'MMM\n'.
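
A slightly more defensive variant, as a sketch only: `str.strip()` removes leading and trailing whitespace, so it also behaves when a cell has no trailing newline, where `[:-1]` would cut off the last letter of the ticker. The helper name `clean_ticker` is hypothetical and not part of the code above.

```python
def clean_ticker(raw):
    """Return the scraped cell text without surrounding whitespace.

    Hypothetical helper, shown as an alternative to ticker[:-1].
    """
    return raw.strip()


# Cell text as scraped, with and without the trailing newline.
print(repr(clean_ticker('MMM\n')))  # 'MMM'
print(repr(clean_ticker('MMM')))    # 'MMM'
print(repr('MMM'[:-1]))             # 'MM', slicing removes a real letter here
```

In the scraping loop this would replace the slice with `ticker = clean_ticker(row.findAll('td')[0].text)`.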

rugy
2

You also need to handle tickers that fail: companies that no longer exist, companies with no data in the range given by your start and end parameters, or symbols that the Yahoo reader does not recognize. This worked well for me:

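import pandas as pd
import pandas_datareader.data as web

failed = []
passed = []
data = pd.DataFrame()
# sp500_symbols is assumed to be your list of ticker strings
# (the original name, s&p_symbols, is not a valid Python identifier)
for x in sp500_symbols:
    try:
        data[x] = web.DataReader(x, data_source="yahoo", start="2019-1-1")["Adj Close"]
        passed.append(x)
    except (IOError, KeyError):
        # record the symbol and carry on instead of aborting the whole run
        print('Failed to read symbol: {0!r}, skipping.'.format(x))
        failed.append(x)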
berkat0789
0

If the same error still occurs after you have edited your code based on Luc McCutcheon's response, wait a while and run get_data_from_yahoo() again. Files that were already saved are skipped, so the run continues where it stopped. I believe this happens because Yahoo Finance throttles the number of requests you can make.
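
The wait-and-rerun step could also be automated. Below is a minimal sketch, assuming the throttling surfaces as an IOError (the same exception type the other answers catch). `fetch_with_retry`, its parameters, and the `fetch` callable are all hypothetical names; `fetch` stands in for whatever download call you use.

```python
import time


def fetch_with_retry(fetch, ticker, attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call fetch(ticker), pausing and retrying when it raises IOError.

    Illustrative helper only; the wait grows after each failed attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(ticker)
        except IOError:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(wait_seconds * attempt)  # wait longer after each failure
```

Inside the download loop, a call such as `fetch_with_retry(lambda t: pdr.get_data_yahoo(t, start, end), ticker)` would replace the direct download call. The `sleep` parameter exists only so the pause can be swapped out when testing.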

Sayyor Y
0

I use this code to solve it:

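import pandas as pd
from pandas_datareader import data as wb

failed = []
passed = []

def collect_data(data):
  # data is the list of ticker symbols to download
  mydata = pd.DataFrame()
  for t in data:
    try:
      mydata[t] = wb.DataReader(t, data_source='yahoo', start='01-10-2019')['Adj Close']
      passed.append(t)
    except (IOError, KeyError):
      # skip symbols that cannot be fetched and remember them
      failed.append(t)

  print(mydata)
  return mydata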
  • 1
    While code-only answers might answer the question, you could significantly improve the quality of your answer by providing context for your code, a reason for why this code works, and some references to documentation for further reading. From [answer]: _"Brevity is acceptable, but fuller explanations are better."_ – Pranav Hosangadi Sep 16 '20 at 18:15