Finding percentage of profitable investments with pandas groupby functionality

Question

I have a pandas DataFrame like this; it shows a history of stock investing. In the Profit column, 1 means profitable and 0 means loss-making.

Stock  Year   Profit  Count
 AAPL  2012    0       23
 AAPL  2012    1       19
 AAPL  2013    0       20
 AAPL  2013    1       10
GOOG   2012    0       26
GOOG   2012    1       20
GOOG   2013    0       23
GOOG   2013    1       11

I have to find out the percentage of profitable investments:

Stock  Year   Profit  CountPercent
 AAPL  2012    1       38.77
 AAPL  2013    1       33.33
GOOG   2012    1       43.47
GOOG   2013    1       32.35

I tried using the method in this post but it shows 'TypeError: Join on level between two MultiIndex objects is ambiguous'.

Please provide your code. What is `CountPercent` meant to be? — Ffisegydd, Nov 21 '14 at 10:16

score 2 · Accepted Answer · answered Nov 21 '14 at 10:32

I have loaded your data into a dataframe called "stocks".

# Get the count of profitable trades, indexed by stock+year:
count_profitable = stocks[ stocks['Profit']==1 ].set_index(['Stock','Year']).Count
# Get the count of all trades, indexed by stock + year:
count_all        = stocks.groupby(['Stock','Year']).Count.sum()
# Render nice percentages
pandas.options.display.float_format = '{:.2f}%'.format 
(count_profitable/count_all) * 100

This will produce:

Stock  Year
AAPL   2012   45.24%
       2013   33.33%
GOOG   2012   43.48%
       2013   32.35%
Name: Count, dtype: float64

unutbu · Answer 2 · 2014-11-21T11:49:17.310

2

You could use pivot_table:

In [38]: result = df.pivot_table(index=['Stock', 'Year'], columns='Profit', values='Count', aggfunc='sum')

In [39]: result['CountPercent'] = result[1]/(result[0]+result[1])

In [41]: result['CountPercent']
Out[41]: 
Stock  Year
AAPL   2012    0.452381
       2013    0.333333
GOOG   2012    0.434783
       2013    0.323529
Name: CountPercent, dtype: float64

edited Nov 21 '14 at 11:49

answered Nov 21 '14 at 11:43

unutbu

842,883
184
1,785
1,677

Alex Riley · Answer 3 · 2014-11-21T12:41:06.250

Assuming your DataFrame is in a consistent format (namely that 0 precedes 1 in the 'Profit' column), you could do the following groupby operations:

>>> grouped = df.groupby(['Stock', 'Year'])
>>> perc = grouped['Count'].last() / grouped['Count'].sum()
>>> perc.reset_index()
  Stock  Year     Count
0  AAPL  2012  0.452381
1  AAPL  2013  0.333333
2  GOOG  2012  0.434783
3  GOOG  2013  0.323529

This is just a normal DataFrame, so it should be straightforward to rename the 'Count' column, round it to two decimal places and add the 'Profit' column back in.

Finding percentage of profitable investments with pandas groupby functionality

3 Answers3