
I have a DataFrame with a two-level index (a MultiIndex) and would like to reindex it along one of the levels.

from pandas_datareader import data
import matplotlib.pyplot as plt
import pandas as pd

# Instruments to download
tickers = ['AAPL']

# Online data source to use
data_source = 'yahoo'

# Data range
start_date = '2000-01-01'
end_date = '2018-01-09'

# Load the desired data
panel_data = data.DataReader(tickers, data_source, start_date, end_date).to_frame()
panel_data.head()

[Screenshot of the panel_data.head() output]

The reindexing goes as follows:

# Get just the adjusted closing prices
adj_close = panel_data['Adj Close']

# Get all weekdays between start and end dates
all_weekdays = pd.date_range(start=start_date, end=end_date, freq='B')

# Align the existing prices in adj_close with our new set of dates
adj_close = adj_close.reindex(all_weekdays, method="ffill")

The last line gives the following error:

TypeError: '<' not supported between instances of 'tuple' and 'int'

This is because the DataFrame index is a MultiIndex, so each entry is a tuple:

panel_data.index[0]
(Timestamp('2018-01-09 00:00:00'), 'AAPL')
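
For readers without access to the download, the index structure can be reproduced with a small hand-made Series. This is only a sketch: the values and the level names `Date` and `Symbol` are made up for illustration and are not taken from the real data.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: three trading days, one ticker.
# The level names "Date" and "Symbol" are assumptions, not the real ones.
dates = pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-05"])
index = pd.MultiIndex.from_product([dates, ["AAPL"]], names=["Date", "Symbol"])
adj_close = pd.Series([10.0, 11.0, 13.0], index=index, name="Adj Close")

# Each entry of a MultiIndex is a tuple with one element per level.
first_entry = adj_close.index[0]

# A single level can be pulled out as an ordinary index.
date_level = adj_close.index.get_level_values("Date")

print(type(adj_close.index).__name__)
print(first_entry)
print(date_level)
```

Comparing these tuples against the plain timestamps in all_weekdays is what the error message is complaining about.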

Is it possible to reindex adj_close? By the way, if I don't convert the Panel object to a DataFrame using to_frame(), the reindexing works as it is. But it seems that Panel objects are deprecated...

Bruno
  • Maybe you're looking for `adj_close.reindex(all_weekdays, level=0).ffill()`? – cs95 Jan 10 '18 at 02:12
  • Perfect! That worked. I'd be glad to accept it as an answer. Thanks a lot for the quick response! – Bruno Jan 10 '18 at 02:17

1 Answer


If you want to reindex on a particular level of a MultiIndex, reindex accepts a level argument:

adj_close.reindex(all_weekdays, level=0)

When you pass a level argument, you cannot pass a method argument at the same time (reindex raises a TypeError), so chain a ffill call afterwards instead:

adj_close.reindex(all_weekdays, level=0).ffill()
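
A different route, not the one used above, is to move the ticker level into the columns first so that the index is a plain DatetimeIndex. After that, reindex can take method="ffill" directly. The sketch below uses made-up prices and the hypothetical level names `Date` and `Symbol`; with real data, substitute whatever names the index actually has.

```python
import pandas as pd

# Hypothetical data: Thursday 2018-01-04 is deliberately missing.
dates = pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-05"])
index = pd.MultiIndex.from_product([dates, ["AAPL"]], names=["Date", "Symbol"])
adj_close = pd.Series([10.0, 11.0, 13.0], index=index, name="Adj Close")

# All weekdays in the range (freq='B' does not know about holidays).
all_weekdays = pd.date_range(start="2018-01-01", end="2018-01-05", freq="B")

# Pivot the ticker level into columns: one row per date, one column per ticker.
wide = adj_close.unstack(level="Symbol")

# The index is now a sorted DatetimeIndex, so a fill method is allowed.
wide = wide.reindex(all_weekdays, method="ffill")

print(wide)
```

Dates that fall before the first available price stay NaN, since there is nothing to carry forward.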
cs95