I just found out how to import a large set of historical stock prices into an array, following http://www.pythonforfinance.net/2017/01/21/investment-portfolio-optimisation-with-python/
I did it with this code:
import pandas_datareader as pdr
import pandas as pd
import numpy as np
import fix_yahoo_finance as yf
# yf.pdr_override()
source = 'yahoo'
stocks = ['AAPL', 'AMZN', 'MSFT', 'GOOG', 'GE']
start = '2017-01-01'
end_date = '2017-12-10'
# Download adjusted closing prices for every ticker between start and end_date
data_array = pdr.DataReader(stocks, source, start, end_date)['Adj Close']
print(data_array)
The issue is that I get the following warning:
Panel is deprecated and will be removed in a future version. The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/. Pandas provides a .to_xarray() method to help automate this conversion.
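To make the warning's suggestion concrete, here is a minimal sketch of what "a MultiIndex on a DataFrame" looks like for price data. It uses small synthetic prices rather than the real download, and it swaps in melt() plus set_index() for Panel.to_frame(), so the names `wide`, `long_df` and `wide_again` are only illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic prices (assumption): 3 business days x 5 tickers, wide layout.
dates = pd.bdate_range("2017-01-02", periods=3, name="Date")
tickers = ["AAPL", "AMZN", "MSFT", "GOOG", "GE"]
wide = pd.DataFrame(
    np.arange(15, dtype=float).reshape(3, 5) + 100.0,
    index=dates,
    columns=tickers,
)

# Long layout: one row per (Date, Ticker) pair, indexed by a MultiIndex.
long_df = (
    wide.reset_index()
    .melt(id_vars="Date", var_name="Ticker", value_name="Adj Close")
    .set_index(["Date", "Ticker"])
    .sort_index()
)

# Back to the wide layout: tickers become columns again (sorted by name).
wide_again = long_df["Adj Close"].unstack("Ticker")
```

With a single field such as 'Adj Close', the wide table (dates as rows, tickers as columns) is already a plain two-dimensional DataFrame; the MultiIndex layout mainly matters once several fields per ticker are kept together.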
So I need advice. My goal is to save the data to a CSV or Excel file. The second step will be to take the average of the last x days of prices, divide the current price by that average, and finally compute a percent rank across the stocks for each day. Given the multiple steps, I figured I would write each step to Excel one after another, so that I don't have to recalculate the whole history every day.
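The steps above can be sketched roughly as follows. This is only a sketch under assumptions: it runs on synthetic prices instead of the real download, `window = 20` is a made-up value for x, the rolling average includes the current day, and the output file name `pct_rank.csv` is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded 'Adj Close' table (assumption):
# one row per business day, one column per ticker.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2017-01-02", periods=60)
tickers = ["AAPL", "AMZN", "MSFT", "GOOG", "GE"]
log_returns = rng.normal(0.0, 0.01, size=(len(dates), len(tickers)))
prices = pd.DataFrame(
    100.0 * np.exp(log_returns.cumsum(axis=0)), index=dates, columns=tickers
)

window = 20  # hypothetical value for "x"

# Step 1: average of the last `window` days of prices, per ticker.
# The current day is included; the first window-1 rows are NaN.
rolling_avg = prices.rolling(window).mean()

# Step 2: current price divided by that average.
ratio = prices / rolling_avg

# Step 3: percent rank across the stocks for each day (row-wise).
pct_rank = ratio.rank(axis=1, pct=True)

# Step 4: save the result (hypothetical file name).
pct_rank.to_csv("pct_rank.csv")
```

If the average should exclude the current day, `prices.shift(1).rolling(window).mean()` would be one way to express that instead.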
QUESTION: Given this ambition, which would be the better method: xarray or a MultiIndex on a DataFrame?