3

i'm a beginner in write code with python.

I wrote this simple script using Pandas and his DataReader to retrieve multiple stock results from yahoo finance:

import pandas as pd
from pandas.io.data import DataReader
from pandas import DataFrame

symbols_list = ['AAPL', 'TSLA', 'YHOO','GOOG', 'MSFT','GILD']

for ticker in symbols_list: 
r = DataReader(ticker, "yahoo", '2015-01-20')
cell= r[['Open','High','Low','Adj Close','Volume']]

print cell

With this code i obtain the price stocks with the date + the others column that i specified in "cell= r[[...." as shown below:

                      Open    High     Low  Adj Close    Volume
         Date                                                   
         2015-01-20  107.84  108.97  106.50     108.72  49899900
         2015-01-21  108.95  111.06  108.27     109.55  48575900
         2015-01-22  110.26  112.47  109.72     112.40  53796400
         2015-01-23  112.30  113.75  111.53     112.98  46464800
         2015-01-26  113.74  114.36  112.80     113.10  55375900
                       Open    High     Low  Adj Close   Volume
         Date                                                  
         2015-01-20  193.87  194.12  187.04     191.93  4489400
         2015-01-21  189.55  198.68  189.51     196.57  4144000
         2015-01-22  197.00  203.24  195.20     201.62  4094100
         2015-01-23  200.29  203.50  198.33     201.29  3438600
         2015-01-26  201.83  208.62  201.05     206.55  3224500

My questions are: how can i include in the columns the tickers that i specified in the symbol_list? And a last thing: how can i invert the order of the dates? i want it to show the newest first (2015-01-26 in my example). Below i show you an example of what i want to obtain (the ticker name as first column and the date order inverted)

  TSLA 2015-01-26 201.83 208.62 201.05 206.55 3224500
  TSLA 2015-01-23 200.29 203.50 198.33 201.29 3438600
  TSLA 2015-01-22 197.00 203.24 195.20 201.62 4094100
  TSLA,2015-01-21 189.55 198.68 189.51 196.57 4144000
  TSLA 2015-01-20 193.87 194.12 187.04 191.93 4489400

  AAPL 2015-01-26 113.74 114.36 112.80 113.10 55375900
  AAPL 2015-01-23 112.30 113.75 111.53 112.98 46464800
  AAPL 2015-01-22 110.26 112.47 109.72 112.40 53796400
  AAPL 2015-01-21 108.95 111.06 108.27 109.55 48575900
  AAPL 2015-01-20 107.84 108.97 106.50 108.72 49899900

I tried a few things founded in forums, but none of that gave me any result. Thank you all for your consideration, hope that someone can give me an hand on this 2 issues.

Andrea
  • 152
  • 1
  • 9
  • So are asking how to have a single df with all the ticker data? Or a dict for all the ticker data? Or perhaps you want a [`panel`](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#panel) – EdChum Jan 27 '15 at 15:40

2 Answers2

4

UPDATE CODE COMPLETED

With the precious help of @EdChum, i've completed my code and my request in this post has completely satisfied(again, thanks for the big help). The final code for obtaining multiple stock prices, a set of columns + the ticker, and inverting the order of the dates(and tickers), is shown below:

import datetime
import pandas as pd
from pandas import DataFrame
from pandas.io.data import DataReader
symbols_list = ['AAPL', 'TSLA', 'YHOO','GOOG', 'MSFT','ALTR','WDC','KLAC']

symbols=[]
for ticker in symbols_list: 
    r = DataReader(ticker, "yahoo", 
                   start=datetime.datetime(2014, 12, 30))
    # add a symbol column
    r['Symbol'] = ticker 
    symbols.append(r)
# concatenate all the dfs
df = pd.concat(symbols)
#define cell with the columns that i need
cell= df[['Symbol','Open','High','Low','Adj Close','Volume']]
#changing sort of Symbol (ascending) and Date(descending) setting Symbol as first column and changing date format
cell.reset_index().sort(['Symbol', 'Date'], ascending=[1,0]).set_index('Symbol').to_csv('stock.csv', date_format='%d/%m/%Y')

Hope this code will help other users.

Andrea
  • 152
  • 1
  • 9
  • Happy to have helped you out – EdChum Jan 29 '15 at 09:24
  • Do you know why this is happening: `error: after 3 tries, Yahoo! did not return a 200 for url 'http://ichart.finance.yahoo.com/table.csv?s=GOOG&a=11&b=30&c=2014&d=2&e=27&f=2015&g=d&ignore=.csv'` – Zhubarb Mar 27 '15 at 10:13
  • maybe this will solve your problem: http://stackoverflow.com/questions/23642194/loading-data-from-yahoo-finance-with-pandas – Andrea Apr 11 '15 at 11:46
1

I think I understand what you want, I would append each df to a list and use concat to make a single df, for each df we can just add a new column for each symbol:

In [54]:

from pandas.io.data import DataReader
symbols_list = ['AAPL', 'TSLA', 'YHOO','GOOG', 'MSFT','GILD']

symbols=[]
for ticker in symbols_list: 
    r = DataReader(ticker, "yahoo", '2015-01-20')
    # add a symbol column
    r['Symbol'] = ticker
    symbols.append(r)
# concatenate all the dfs
df = pd.concat(symbols)
df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 30 entries, 2015-01-20 00:00:00 to 2015-01-26 00:00:00
Data columns (total 7 columns):
Open         30 non-null float64
High         30 non-null float64
Low          30 non-null float64
Close        30 non-null float64
Volume       30 non-null int64
Adj Close    30 non-null float64
Symbol       30 non-null object
dtypes: float64(5), int64(1), object(1)
memory usage: 1.9+ KB
In [55]:

df
Out[55]:
              Open    High     Low   Close    Volume  Adj Close Symbol
Date                                                                  
2015-01-20  107.84  108.97  106.50  108.72  49899900     108.72   AAPL
2015-01-21  108.95  111.06  108.27  109.55  48575900     109.55   AAPL
2015-01-22  110.26  112.47  109.72  112.40  53796400     112.40   AAPL
2015-01-23  112.30  113.75  111.53  112.98  46464800     112.98   AAPL
2015-01-26  113.74  114.36  112.80  113.10  55375900     113.10   AAPL
2015-01-20  193.87  194.12  187.04  191.93   4503200     191.93   TSLA
2015-01-21  189.55  198.68  189.51  196.57   4153000     196.57   TSLA
2015-01-22  197.00  203.24  195.20  201.62   4094100     201.62   TSLA
2015-01-23  200.29  203.50  198.33  201.29   3438600     201.29   TSLA
2015-01-26  201.83  208.62  201.05  206.55   3224500     206.55   TSLA
2015-01-20   46.79   47.89   46.77   47.63  15845900      47.63   YHOO
2015-01-21   47.60   48.38   47.32   48.18  16305100      48.18   YHOO
2015-01-22   48.43   49.08   48.01   48.89  12647400      48.89   YHOO
2015-01-23   48.74   49.23   48.63   48.95  14206100      48.95   YHOO
2015-01-26   49.57   49.79   49.07   49.44  18928700      49.44   YHOO
2015-01-20  511.00  512.50  506.02  506.90   2221800     506.90   GOOG
2015-01-21  507.25  519.28  506.20  518.04   2262500     518.04   GOOG
2015-01-22  521.48  536.33  519.70  534.39   2653600     534.39   GOOG
2015-01-23  535.59  542.17  533.00  539.95   2266800     539.95   GOOG
2015-01-26  538.53  539.00  529.67  535.21   1532400     535.21   GOOG
2015-01-20   46.30   46.65   45.57   46.39  36161900      46.39   MSFT
2015-01-21   45.94   46.14   45.48   45.92  39081100      45.92   MSFT
2015-01-22   46.38   47.14   46.08   47.13  35898000      47.13   MSFT
2015-01-23   47.36   47.39   46.80   47.18  26211600      47.18   MSFT
2015-01-26   47.00   47.13   46.24   47.01  39577800      47.01   MSFT
2015-01-20  101.14  104.05  100.15  103.78  18838100     103.78   GILD
2015-01-21  103.04  105.07  102.57  103.58  12312500     103.58   GILD
2015-01-22  104.14  105.18  101.56  105.14  12963100     105.14   GILD
2015-01-23  104.40  106.11  103.80  105.54   9566700     105.54   GILD
2015-01-26  104.87  107.77  104.87  107.11  10254600     107.11   GILD

To resort call sort and pass ascending=False:

In [56]:

df.sort(ascending=False)
Out[56]:
              Open    High     Low   Close    Volume  Adj Close Symbol
Date                                                                  
2015-01-26  104.87  107.77  104.87  107.11  10254600     107.11   GILD
2015-01-26  201.83  208.62  201.05  206.55   3224500     206.55   TSLA
2015-01-26  538.53  539.00  529.67  535.21   1532400     535.21   GOOG
2015-01-26   49.57   49.79   49.07   49.44  18928700      49.44   YHOO
2015-01-26   47.00   47.13   46.24   47.01  39577800      47.01   MSFT
2015-01-26  113.74  114.36  112.80  113.10  55375900     113.10   AAPL
2015-01-23  104.40  106.11  103.80  105.54   9566700     105.54   GILD
2015-01-23   47.36   47.39   46.80   47.18  26211600      47.18   MSFT
2015-01-23  200.29  203.50  198.33  201.29   3438600     201.29   TSLA
2015-01-23  535.59  542.17  533.00  539.95   2266800     539.95   GOOG
2015-01-23   48.74   49.23   48.63   48.95  14206100      48.95   YHOO
2015-01-23  112.30  113.75  111.53  112.98  46464800     112.98   AAPL
2015-01-22  521.48  536.33  519.70  534.39   2653600     534.39   GOOG
2015-01-22   46.38   47.14   46.08   47.13  35898000      47.13   MSFT
2015-01-22  110.26  112.47  109.72  112.40  53796400     112.40   AAPL
2015-01-22   48.43   49.08   48.01   48.89  12647400      48.89   YHOO
2015-01-22  104.14  105.18  101.56  105.14  12963100     105.14   GILD
2015-01-22  197.00  203.24  195.20  201.62   4094100     201.62   TSLA
2015-01-21  507.25  519.28  506.20  518.04   2262500     518.04   GOOG
2015-01-21  189.55  198.68  189.51  196.57   4153000     196.57   TSLA
2015-01-21   47.60   48.38   47.32   48.18  16305100      48.18   YHOO
2015-01-21  103.04  105.07  102.57  103.58  12312500     103.58   GILD
2015-01-21   45.94   46.14   45.48   45.92  39081100      45.92   MSFT
2015-01-21  108.95  111.06  108.27  109.55  48575900     109.55   AAPL
2015-01-20  101.14  104.05  100.15  103.78  18838100     103.78   GILD
2015-01-20  107.84  108.97  106.50  108.72  49899900     108.72   AAPL
2015-01-20   46.30   46.65   45.57   46.39  36161900      46.39   MSFT
2015-01-20   46.79   47.89   46.77   47.63  15845900      47.63   YHOO
2015-01-20  193.87  194.12  187.04  191.93   4503200     191.93   TSLA
2015-01-20  511.00  512.50  506.02  506.90   2221800     506.90   GOOG

Update

If you want to retain the ticker symbol groupings and then sort by date then the following is needed. Here I have to reset the index so that the 'Date' becomes a regular column again, I then pass a list of columns which it will process in that order, finally I have to set the index back again:

In [44]:

df.reset_index().sort(['Symbol', 'Date'], ascending=False).set_index('Date')
Out[44]:
              Open    High     Low   Close     Volume  Adj Close Symbol
Date                                                                   
2015-01-27   49.14   49.28   47.66   47.99   44813600      47.99   YHOO
2015-01-26   49.57   49.79   49.07   49.44   18928700      49.44   YHOO
2015-01-23   48.74   49.23   48.63   48.95   14206100      48.95   YHOO
2015-01-22   48.43   49.08   48.01   48.89   12647400      48.89   YHOO
2015-01-21   47.60   48.38   47.32   48.18   16305100      48.18   YHOO
2015-01-20   46.79   47.89   46.77   47.63   15845900      47.63   YHOO
2015-01-27  204.42  208.03  203.30  205.98    2777700     205.98   TSLA
2015-01-26  201.83  208.62  201.05  206.55    3224500     206.55   TSLA
2015-01-23  200.29  203.50  198.33  201.29    3438600     201.29   TSLA
2015-01-22  197.00  203.24  195.20  201.62    4094100     201.62   TSLA
2015-01-21  189.55  198.68  189.51  196.57    4153000     196.57   TSLA
2015-01-20  193.87  194.12  187.04  191.93    4503200     191.93   TSLA
2015-01-27   42.95   43.20   42.11   42.66  169112700      42.66   MSFT
2015-01-26   47.00   47.13   46.24   47.01   42525500      47.01   MSFT
2015-01-23   47.36   47.39   46.80   47.18   26211600      47.18   MSFT
2015-01-22   46.38   47.14   46.08   47.13   35898000      47.13   MSFT
2015-01-21   45.94   46.14   45.48   45.92   39081100      45.92   MSFT
2015-01-20   46.30   46.65   45.57   46.39   36161900      46.39   MSFT
2015-01-27  529.97  530.70  518.19  518.63    1897300     518.63   GOOG
2015-01-26  538.53  539.00  529.67  535.21    1532400     535.21   GOOG
2015-01-23  535.59  542.17  533.00  539.95    2266800     539.95   GOOG
2015-01-22  521.48  536.33  519.70  534.39    2653600     534.39   GOOG
2015-01-21  507.25  519.28  506.20  518.04    2262500     518.04   GOOG
2015-01-20  511.00  512.50  506.02  506.90    2221800     506.90   GOOG
2015-01-27  105.94  107.02  104.86  105.03   10334500     105.03   GILD
2015-01-26  104.87  107.77  104.87  107.11   10254600     107.11   GILD
2015-01-23  104.40  106.11  103.80  105.54    9566700     105.54   GILD
2015-01-22  104.14  105.18  101.56  105.14   12963100     105.14   GILD
2015-01-21  103.04  105.07  102.57  103.58   12312500     103.58   GILD
2015-01-20  101.14  104.05  100.15  103.78   18838100     103.78   GILD
2015-01-27  112.42  112.48  109.03  109.14   91929200     109.14   AAPL
2015-01-26  113.74  114.36  112.80  113.10   55615000     113.10   AAPL
2015-01-23  112.30  113.75  111.53  112.98   46464800     112.98   AAPL
2015-01-22  110.26  112.47  109.72  112.40   53796400     112.40   AAPL
2015-01-21  108.95  111.06  108.27  109.55   48575900     109.55   AAPL
2015-01-20  107.84  108.97  106.50  108.72   49899900     108.72   AAPL
EdChum
  • 376,765
  • 198
  • 813
  • 562
  • Yes, that's exactly what i wanted to do: but when i paste your code and try it i get this result: File "", line 15 ^ SyntaxError: invalid syntax why? i'm using your instruction, and i'm using pandas 0.15.1 – Andrea Jan 28 '15 at 07:58
  • This line `` is not code, it's the output from `df.info()` so ignore it, the key bits of code for you are: `symbols=[] for ticker in symbols_list: r = DataReader(ticker, "yahoo", '2015-01-20') # add a symbol column r['Symbol'] = ticker symbols.append(r) # concatenate all the dfs df = pd.concat(symbols)` and `df.sort(ascending=False)` – EdChum Jan 28 '15 at 08:40
  • what an idiot: it work just fine now, i am really sorry about that. One question: why when i write df.sort(ascending=False) the program will not just invert the order of the dates, but he creates a group of tickers? when you posted your code before sort (ascending=false) the program had an order that i needed(AAPL, AAPL, and so on...). But after that line of code, the symbol list is not grouped by ticker, but date. (GILD,TSLA,GOOG,YHOO and so on...it is not grouped by ticker) – Andrea Jan 28 '15 at 09:25
  • @Andrea if you want to sort by the ticker name and then date you can just pass multiple columns like so: `df.sort(['Symbol', df.index], ascending=False)` I think should work, – EdChum Jan 28 '15 at 19:53
  • @Andrea I've updated my answer to satisfy your last requirement – EdChum Jan 28 '15 at 20:00
  • Hi EdChum, I've edited your precious code to fit my needs, and it worked just perfect. Thank you for all your patience, your professionality, and all your help. Much oblidged. I'll post an update with all my code completed with your work – Andrea Jan 29 '15 at 09:21