0

My dataframe:

      Date       AAPL      NFLX       INTC    AAPL_Ret   NFLX_Ret   INTC_Ret
0 2008-01-31  27.834286  3.764286  25.350000       
1 2008-02-29  27.847143  3.724286  24.670000   -0.07     0.25       -0.05     
2 2008-03-31  25.721428  3.515714  22.670000    0.15     0.10       0.06  
3 2008-04-30  25.377142  3.554286  22.879999    etc
4 2008-05-31  24.464285  3.328571  22.260000

After resampling my original dataframe I calculated the returns for each stock (see above). Now I'd like to calculate (i) percentiles for the returns across stocks, i.e., going across the row (essentially comparing a stock to others) and not down the column, call that "Relative Percentile" say (ii) percentiles for each stock's own returns, going down the column, call that "Absolute Percentile"

I was shown how to select the _Ret columns to do the quantile calculation but am stuck at going from selecting the _Ret columns to create new columns with the relative quantile and absolute quantiles.

Have tried this:

df['Quantile'] = df[[col for col in df.columns if "_Ret" in col]].quantile(0.9)

But this returns a blank "Quantile" column in between AAPL_Ret and NFLX_Ret

How to do this? Thanks

Eksana Stasis
  • 173
  • 1
  • 1
  • 7
  • use list comprehension to generate a list of the columns you want: `[col for col in df.columns if '_Ret' in col]` and then subset the `DataFrame` by those columns when you calculate the quantiles. – ALollz Jun 23 '18 at 17:32
  • so you mean something like: df [col for col in df.columns if '_Ret' in col ].quantile(0.9) ? Not sure how to vary the call to do the relative comparison and the absolute comparison – Eksana Stasis Jun 23 '18 at 17:57
  • I tried: df['Quantile'] = df[[col for col in df.columns if "_Ret" in col]].quantile(0.9) but this only returns one quantile column between AAPL_Ret and NFLX_Ret and the column is empty – Eksana Stasis Jun 23 '18 at 18:01
  • Perhaps, that dups isn't enough... Quantiles accepts an axis parameter, the default is 0, which calculates the quantile for a column. The result is a series with the columns as the indices and the quantile for that column as the value. It doesn't really make sense to append this back to your DataFrame as a new column, where each row corresponds to a `Date`, not a ticker. On the other hand, the `.quantile(0.9, axis=1)` is the one you want to append back to your `DataFrame`, because it is the quantile for each `Date`. – ALollz Jun 23 '18 at 18:12
  • More information about the axis parameter: [what does axis in pandas mean](https://stackoverflow.com/questions/22149584/what-does-axis-in-pandas-mean) – ALollz Jun 23 '18 at 18:14
  • adding the axis=1 worked to return the 90th quantile acting on all the columns in each row. Now, how can I call quantile to run on all the rows in each column, i.e., as I understand, using axis =0? I've tried: df['Abs_Pctile'] = df[[col for col in df.columns if "_Ret" in col]].quantile(0.9, axis=0) but it appends a single blank column "Abs Pctile". I need to return something like a rolling 90th percentile quantile for all the rows in each column, for each stock. Thanks for your patience, I am a beginner and am probably asking all the wrong questions here – Eksana Stasis Jun 23 '18 at 18:48
  • Just do `df[[col for col in df.columns if "_Ret" in col]].quantile(0.9)`. (You can assign this to a different variable). Don't assign it back to a column. You are getting `NaN` because when you try to assign it to a column in your `DataFrame` it does so by index. However, the above quantile calculation has the tickers as the indices, while your original `DataFrame` does not, so you will get all `NaN` if you try to assign it to a column. The organization of your `df` and that result are totally different, so you probably want to store them separately. – ALollz Jun 23 '18 at 18:58

0 Answers0