
I can read the "AAPL" symbol historical data from yahoo

dfcomp3 = web.DataReader(["AAPL"],'yahoo',start=start,end=end)['Adj Close']

I can read the "GE" symbol historical data from yahoo

dfcomp3 = web.DataReader(["GE"],'yahoo',start=start,end=end)['Adj Close']

I can read the "BTC-USD" symbol historical data from yahoo

dfcomp3 = web.DataReader(["BTC-USD"],'yahoo',start=start,end=end)['Adj Close']

I can read both "AAPL","GE" symbols historical data from yahoo

dfcomp7 = web.DataReader(["GE", "AAPL"],'yahoo',start=start,end=end)['Adj Close']

I can't read both "AAPL","BTC-USD" symbols historical data from yahoo

dfcomp7 = web.DataReader(["BTC-USD", "AAPL"],'yahoo',start=start,end=end)['Adj Close']

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-58-0cbbb3aa9346> in <module>()
----> 1 dfcomp7 = web.DataReader(["BTC-USD", "AAPL" ],'yahoo',start=start,end=end)['Adj Close']

7 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/reshape/reshape.py in _make_selectors(self)
    164 
    165         if mask.sum() < len(self.index):
--> 166             raise ValueError('Index contains duplicate entries, '
    167                              'cannot reshape')
    168 

ValueError: Index contains duplicate entries, cannot reshape

Why?

Zoe
mhndlsz
  • Possible duplicate of https://stackoverflow.com/questions/28651079/pandas-unstack-problems-valueerror-index-contains-duplicate-entries-cannot-re – Kshitij Saxena Sep 11 '19 at 09:13

2 Answers


Run the call in a debugger and call value_counts() on self.index. That shows which date and symbol combination is causing the issue.

When BTC-USD is downloaded by itself, the issue does not appear. pandas-datareader unstacks the result so that the symbols become column names, which is not a problem while there is just one symbol. With several symbols, however, the unstacking step raises the error.

I had the same problem with the symbols CBS, STI and VIAB for the dates 4 Dec 2019 and 6 Dec 2019.
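The diagnosis above can be reproduced without a network call. The following is a minimal sketch on synthetic data, not the internals of pandas-datareader: the prices, the dates, and the assumption that Yahoo returned one (date, symbol) pair twice are all made up for illustration. It counts occurrences per index entry with a groupby, in the spirit of the value_counts() suggestion, and shows that unstacking fails until the duplicate is dropped.

```python
import pandas as pd

# Synthetic long-format prices. The repeated (2019-12-04, BTC-USD) row is an
# assumed stand-in for a duplicate returned by the data source.
long_prices = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2019-12-03", "2019-12-04", "2019-12-04", "2019-12-03", "2019-12-04"]
        ),
        "Symbol": ["BTC-USD", "BTC-USD", "BTC-USD", "AAPL", "AAPL"],
        "Adj Close": [7300.0, 7200.0, 7210.0, 64.0, 65.0],
    }
).set_index(["Date", "Symbol"])["Adj Close"]

# Count how often each (date, symbol) pair occurs; anything above 1 is a duplicate.
counts = long_prices.groupby(level=["Date", "Symbol"]).size()
duplicates = counts[counts > 1]
print(duplicates)

# Unstacking the symbol level raises ValueError while the duplicate is present.
try:
    long_prices.unstack("Symbol")
    unstack_error = None
except ValueError as exc:
    unstack_error = str(exc)
print(unstack_error)

# Dropping repeated index entries (keeping the first) lets the reshape succeed.
wide_prices = long_prices[~long_prices.index.duplicated(keep="first")].unstack("Symbol")
print(wide_prices)
```

Keeping the first of the duplicated rows is an arbitrary choice here; whether the first or last value is the right one depends on what the source actually sent.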

mike

I realize this is an old question, but I was having the same problem with a Yahoo Finance download. I believe this particular problem is very specific to Yahoo, which for some reason sends multiple prices for a single day. One of the suggestions involves reindexing, but because of how 'DataReader' converts the response to pandas, the dataframe is never created at all, so there is nothing to reindex.

Here is my solution. I've included a try/except because I think the problem might be temporary (e.g., Yahoo fixes the duplicate in the future), and since my code runs every day, I wanted it to work whether or not the duplicate is present in the output sent. I'm using the major Russell indices as my sample here. The code first tries the normal way; if a ValueError is raised, it loops over each symbol individually, drops any duplicates (keeping the first by default) and merges the dataframes into one.

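import datetime as dt

import pandas as pd
import pandas_datareader.data as web


def get_yahoo():
    start = dt.datetime(1995, 12, 31)
    end = dt.datetime.today()
    yh_fields = ['^RLG', '^RLV', '^RUO', '^RUJ']
    try:
        # Normal path: request all symbols in a single call.
        yho = web.DataReader(yh_fields, 'yahoo', start, end)['Adj Close']
    except ValueError:
        # Fallback: fetch each symbol separately, drop duplicated dates
        # (keeping the first), and outer-join the series into one frame.
        yho = pd.DataFrame()
        for y in yh_fields:
            temp = web.DataReader(y, 'yahoo', start, end)['Adj Close']
            temp = temp.rename(y)
            temp = temp[~temp.index.duplicated()]
            yho = yho.join(temp, how='outer')
    return yho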
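The fallback branch can be tried offline on made-up series. This is a sketch under assumptions, not the code above: `combine_without_duplicates` is a hypothetical helper, the sample values are invented, and it swaps the loop of `join` calls for a single `pd.concat(axis=1)`, which also aligns the series on the union of their dates.

```python
import pandas as pd


def combine_without_duplicates(series_by_symbol):
    """Drop repeated dates from each series, then align them column-wise."""
    cleaned = {
        symbol: series[~series.index.duplicated(keep="first")]
        for symbol, series in series_by_symbol.items()
    }
    # With a dict and axis=1, the keys become column names and the
    # indexes are outer-joined.
    return pd.concat(cleaned, axis=1).sort_index()


# Invented sample data: '^RLG' carries a duplicated date, '^RLV' has an extra day.
dates_a = pd.to_datetime(["2019-12-03", "2019-12-04", "2019-12-04"])
dates_b = pd.to_datetime(["2019-12-03", "2019-12-04", "2019-12-05"])
sample = {
    "^RLG": pd.Series([1.0, 2.0, 2.5], index=dates_a),
    "^RLV": pd.Series([10.0, 11.0, 12.0], index=dates_b),
}

combined = combine_without_duplicates(sample)
print(combined)
```

Dates that exist for only one symbol show up as NaN in the other column, which matches what an outer join would produce.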
Tom