I have created the following dictionary:

for k, er in dicio.items():
    #dicio[k]['Return %'] = er.iloc[:, 0].pct_change(-1)*100
    dicio[k]['Day'] = er.index.day
dicio

 {'WDOFUT':             WDOFUT  Day
 Data                   
 2020-09-11  5325.0   11
 2020-09-10  5325.0   10
 2020-09-09  5312.5    9
 2020-09-08  5366.0    8
 2020-09-04  5303.0    4
 ...            ...  ...
 1994-07-08     NaN    8
 1994-07-07     NaN    7
 1994-07-06     NaN    6
 1994-07-05     NaN    5
 1994-07-04     NaN    4
 
 [6482 rows x 2 columns],
 'WEGE3':             WEGE3  Day
 Data                  
 2020-09-11  62.42   11
 2020-09-10  62.42   10
 2020-09-09  64.93    9
 2020-09-08  63.00    8
 2020-09-04  64.49    4
 ...           ...  ...
 1994-07-08    NaN    8
 1994-07-07    NaN    7
 1994-07-06    NaN    6
 1994-07-05    NaN    5
 1994-07-04    NaN    4
 
 [6482 rows x 2 columns],
 'YDUQ3':             YDUQ3  Day
 Data                  
 2020-09-11  27.31   11
 2020-09-10  27.31   10
 2020-09-09  27.99    9
 2020-09-08  28.75    8
 2020-09-04  27.78    4
 ...           ...  ...
 1994-07-08    NaN    8
 1994-07-07    NaN    7
 1994-07-06    NaN    6
 1994-07-05    NaN    5
 1994-07-04    NaN    4
 
 [6482 rows x 2 columns]}

I can group by day, but it is only taking the last item of the dictionary (YDUQ3):

grouped_by_day = dicio[k].groupby('Day')
grouped_by_day.describe()

     YDUQ3
     count       mean        std   min     25%     50%      75%    max
Day
1     86.0  13.974651   9.391865  2.96  5.4450  11.770  21.2000  39.75
2     95.0  15.022842  10.624683  2.57  5.6900  13.290  21.4050  49.19
3    102.0  15.262549  11.061839  2.44  5.8950  12.800  21.8575  53.85
...
29    96.0  14.498229  10.321219  2.61  5.4150  12.975  21.0425  50.88
30    92.0  14.914674  10.701043  2.61  5.5125  13.120  21.7150  51.32
31    51.0  15.339608  10.676544  2.96  6.1350  13.420  21.7150  51.73

I can see the daily-grouped dictionary displayed below, but only for the last item (I need all):

list(grouped_by_day)

[(1,
              YDUQ3  Day
  Data                  
  2020-09-01  27.89    1
  2020-07-01  34.41    1
  2020-06-01  29.82    1
  2020-04-01  21.30    1
  2019-11-01  39.75    1
  ...           ...  ...
  1995-02-01    NaN    1
  1994-12-01    NaN    1
  1994-11-01    NaN    1
  1994-09-01    NaN    1
  1994-08-01    NaN    1      
  [182 rows x 2 columns]),
   ......................
   ......................
  (31,
              YDUQ3  Day
  Data                  
  2020-08-31  26.95   31
  2020-07-31  33.89   31
  2020-03-31  21.76   31
  2020-01-31  51.73   31
  2019-10-31  38.52   31
  ...         ...    ...
  1995-05-31    NaN   31
  1995-03-31    NaN   31
  1995-01-31    NaN   31
  1994-10-31    NaN   31
  1994-08-31    NaN   31
  
  [113 rows x 2 columns])]

Question:

  • How can I display results for all 3 items of the dictionary? (dicio[k] only refers to one key, the last one.)

  • I would like to add up Return % across all occurrences of the same day of the month.

    • Over a 10-year span there will be ~120 occurrences of day 01, ~120 of day 02, and so on.

    • Each symbol will have a 31 x ~120 dictionary from which we can select the day with the highest cumulative return and the day with the lowest.

    • Then I would like to display, for the entire portfolio of stocks, the highest/lowest returns and the days on which they occur.

1 Answer

I am not sure from the details, but the framing of your question suggests that you have a separate DataFrame for each stock. If that is the case, you might try combining them all into a single DataFrame. I put together this example to illustrate what I mean.

  import pandas as pd
  import numpy as np
  dicio =  {
      'WDOFUT': [              
   [pd.Timestamp(year=2020, month= 9, day= 11),  5325.0, 11],
   [pd.Timestamp(year=2020, month= 9, day= 10),  5325.0, 10],
   [pd.Timestamp(year=2020, month= 9, day= 9),  5312.5, 9],
   [pd.Timestamp(year=2020, month= 9, day= 8),  5366.0, 8],
   [pd.Timestamp(year=2020, month= 9, day= 4),  5303.0, 4],
   [pd.Timestamp(year=1994, month= 7, day= 8),  np.nan,  8],
   [pd.Timestamp(year=1994, month= 7, day= 7),  np.nan, 7],
   [pd.Timestamp(year=1994, month= 7, day= 6),  np.nan, 6],
   [pd.Timestamp(year=1994, month= 7, day= 5),  np.nan,  5],
   [pd.Timestamp(year=1994, month= 7, day= 4),  np.nan, 4],],
      'WEGE3': [
   [pd.Timestamp(year=2020, month=9, day= 11),  62.42, 11],
   [pd.Timestamp(year=2020, month=9, day= 10),  62.42, 10],
   [pd.Timestamp(year=2020, month=9, day= 9),  64.93,  9],
   [pd.Timestamp(year=2020, month=9, day= 8), 63.00,  8],
   [pd.Timestamp(year=2020, month=9, day= 4),  64.49,  4],
   [pd.Timestamp(year=1994, month=7, day= 8), np.nan,  8],
   [pd.Timestamp(year=1994, month=7, day= 7), np.nan,  7],
   [pd.Timestamp(year=1994, month=7, day= 6), np.nan, 6],
   [pd.Timestamp(year=1994, month=7, day=5), np.nan,  5],
   [pd.Timestamp(year=1994, month=7, day=4), np.nan,  4]
   ],
      'YDUQ3':[                  
   [pd.Timestamp(year=2020, month=9, day= 11),  27.31,   11],
   [pd.Timestamp(year=2020, month=9, day= 10),  27.31,    10],
   [pd.Timestamp(year=2020, month=9, day= 9),  27.99,    9],
   [pd.Timestamp(year=2020, month=9, day= 8),  28.75,    8],
   [pd.Timestamp(year=2020, month=9, day= 4),  27.78,   4],
   [pd.Timestamp(year=1994, month=7, day= 8), np.nan,   8],
   [pd.Timestamp(year=1994, month=7, day= 7), np.nan,  7],
   [pd.Timestamp(year=1994, month=7, day= 6), np.nan,   6],
   [pd.Timestamp(year=1994, month=7, day= 5), np.nan,  5],
   [pd.Timestamp(year=1994, month=7, day= 4), np.nan,  4]],
   }
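  # Flatten the dict into rows of [Stock, Date, Return, Day].
  # Note: the 'Return' column here holds the sample prices from the question.
  data_list = []
  for stk, rows in dicio.items():
      for itm in rows:
          data_list.append([stk, *itm])
  df = pd.DataFrame(data=data_list, columns=['Stock', 'Date', 'Return', 'Day'])
  # Average only the 'Return' column for each (Day, Stock) pair
  grouped_by_day = df.groupby(by=['Day', 'Stock'])[['Return']].mean()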
    

A printout of grouped_by_day yields:

             
Day Stock    Return
4   WDOFUT  5303.00
    WEGE3     64.49
    YDUQ3     27.78
5   WDOFUT      NaN
    WEGE3       NaN
    YDUQ3       NaN
6   WDOFUT      NaN
    WEGE3       NaN
    YDUQ3       NaN
7   WDOFUT      NaN
    WEGE3       NaN
    YDUQ3       NaN
8   WDOFUT  5366.00
    WEGE3     63.00
    YDUQ3     28.75
9   WDOFUT  5312.50
    WEGE3     64.93
    YDUQ3     27.99
10  WDOFUT  5325.00
    WEGE3     62.42
    YDUQ3     27.31
11  WDOFUT  5325.00
    WEGE3     62.42
    YDUQ3     27.31

I think you should be able to derive the results you are looking for from this groupby result.
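If each value in `dicio` is already a DataFrame indexed by date, as the question's printout suggests, `pd.concat` may be a shorter route to the same long table, and it avoids relying on whatever `k` happens to hold after the loop. This is only a sketch: the sample prices are made up, and the `Price`, `Stock` and `long_df` names are my own assumptions, not something from the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's dicio: one DataFrame per symbol,
# indexed by date, whose first column holds the price.
dates = pd.to_datetime(["2020-09-11", "2020-09-10", "2020-08-11", "2020-08-10"])
dicio = {
    "WEGE3": pd.DataFrame({"WEGE3": [62.42, 62.42, 60.00, 59.00]}, index=dates),
    "YDUQ3": pd.DataFrame({"YDUQ3": [27.31, 27.31, 30.00, 31.00]}, index=dates),
}

# Take the first column of every frame under a common name and stack them.
# The dict keys become the outer index level ("Stock").
long_df = pd.concat(
    {k: df.iloc[:, 0].rename("Price") for k, df in dicio.items()},
    names=["Stock", "Data"],
).reset_index()
long_df["Day"] = long_df["Data"].dt.day

# One describe() row per (Stock, Day) pair, covering every symbol.
stats = long_df.groupby(["Stock", "Day"])["Price"].describe()
print(stats)
```

Grouping by both `Stock` and `Day` keeps the symbols separate, so the statistics for all keys appear in one result instead of only those of the last key.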

itprorh66
  • Sorry, it's not what I am looking for. – Daniel Bittencourt Sep 24 '20 at 10:46
  • What is it you are trying to achieve? How does your desired result differ from the proposed solution? – itprorh66 Sep 24 '20 at 12:32
  • There will be multiple stock symbols, say spanning 2010 to 2020. I calculate the daily return for each symbol. Then each symbol will have ~120 occurrences of day 01, ~120 of day 02, and so on. I would like to know the cumulative return for each day of the month. For example, the stock WEGE3 might have its highest cumulative sum of returns among the 31 days on day 03 and its lowest on day 19. Ask more if needed. – Daniel Bittencourt Sep 24 '20 at 13:07
  • You can refer to this post: https://stackoverflow.com/questions/63901027/date-is-not-working-even-when-date-column-is-set-to-index ....... There, Mr. Trenton McKinney arrived at the result by using the return without considering the day. I would like to find the best and worst days for each symbol. – Daniel Bittencourt Sep 24 '20 at 13:15
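
The calculation described in these comments (daily return per symbol, summed by day of the month, then the best and worst day for each symbol) could be sketched roughly as below. The symbols `AAA` and `BBB`, their prices, and the `by_day` and `summary` names are hypothetical, and "cumulative" is taken to mean a plain sum of the daily percentage returns, which is my reading of the comments rather than something they confirm.

```python
import pandas as pd

# Hypothetical prices for two made-up symbols; real data would come from dicio.
idx = pd.to_datetime(
    ["2020-01-01", "2020-01-02", "2020-01-03",
     "2020-02-01", "2020-02-02", "2020-02-03"]
)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0, 99.0, 108.9, 98.01],
     "BBB": [50.0, 45.0, 54.0, 54.0, 48.6, 58.32]},
    index=idx,
)

# Daily return in percent, computed in chronological order.
returns = prices.sort_index().pct_change() * 100

# Sum the returns of every occurrence of the same day of the month.
# sum() skips the NaN produced for the first row.
by_day = returns.groupby(returns.index.day).sum()
by_day.index.name = "Day"

# For each symbol: the day with the highest and the lowest summed return.
summary = pd.DataFrame({
    "Best day": by_day.idxmax(),
    "Best sum %": by_day.max(),
    "Worst day": by_day.idxmin(),
    "Worst sum %": by_day.min(),
})
print(summary)
```

`by_day` has one row per day of the month and one column per symbol, so `idxmax` and `idxmin` return the best and worst day for the whole portfolio in one step. Note that the question's frames are sorted newest first, which is why the sketch sorts the index before calling `pct_change`.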