1

I have a DataFrame that looks like this:

data = pd.DataFrame({'id' : ['a1', 'a1', 'a1', 'a2','a2','a3','a3','a3'], 
                     'label' : ['0','0','1','0','1','0','1','1']})

My goal is to group by id and to do some arithmetics: to calculate the number of occurrence of '0' and '1' in each group, add 1000 to each number and, finally, to divide these numbers by each other. For example, for group 'a1': '0' occurs twice, '1' occurs once, then, 2 * 1000 / 1 * 1000 = 2.

The desired DataFrame should look like this:

id number
a1 2
a2 1
a3 0,5

These SO questions helped me a lot:

Group by two columns and count the occurrences of each combination in pandas

pandas groupby count string occurrence over column

I've tried it out in different variations but still didn't reach the desired output. Any help would be very approciated.

Ekaterina
  • 195
  • 3
  • 11

3 Answers3

3

groupby then value_counts, and we do pct_change

data.groupby('id').label.apply(lambda x : x.value_counts(sort=False).pct_change()+1).dropna().reset_index(level=0)
Out[405]: 
   id  label
0  a1    2.0
0  a2    1.0
0  a3    0.5

Or

pd.crosstab(data.id,data.label).assign(v=lambda x : x['0']/x['1'])
Out[414]: 
label  0  1    v
id              
a1     2  1  2.0
a2     1  1  1.0
a3     1  2  0.5
BENY
  • 317,841
  • 20
  • 164
  • 234
2
data.groupby('id').label.apply(lambda x: (x == '0').sum()/(x == '1').sum())

Output:

id
a1    2.0
a2    1.0
a3    0.5
Name: label, dtype: float64
ALollz
  • 57,915
  • 7
  • 66
  • 89
  • You will get a `RuntimeWarning` if a group does not contain `'1'`, but will get `np.inf` as the value. – ALollz Jun 04 '18 at 21:39
1

You can use collections.Counter and explicitly define the result you require in the case no 1 value exists in your group.

from collections import Counter

def calculator(x):
    c = Counter(x)
    try:
        return c['0'] / c['1']
    except ZeroDivisionError:
        return np.nan

res = data.groupby('id')['label'].apply(calculator)

id
a1    2.0
a2    1.0
a3    0.5
Name: label, dtype: float64
jpp
  • 159,742
  • 34
  • 281
  • 339