0

I have df like like this.

import pandas as pd
import numpy as np
user = pd.DataFrame({'User':['101','101','101','102','102','101','101','102','102','102','102','102'],'Country':['India','Japan','India','Brazil','Japan','UK','Austria','Japan','Singapore','UK','UK','UK']
                    ,'Count':[85,78,70,5,6,8,60,30,5,6,5,4]})

i want to sort count column and assign assign top 30% rows to group 3, then next 30% to 2 and remaining 30% to group 1. How can i do that. this is my expected output. the first 4 columns. and also see my calculations how i divided 30%,30%,40%

want output

Kumar AK
  • 987
  • 3
  • 10
  • 23

1 Answers1

1

You need first sort columns by sort_values and then groupby with custom function with numpy.split and return length of each group to new row of new DataFrame:

Idea from perfect MaxU answer, thank you.


Use for top 30-30-30:

user = user.sort_values(['User','Count'], ascending=[True, False])

def f(x):
    #split to 4 groups, because 3 + 3 + 3 != 1 
    a, b, c, d = np.split(x, [int(.3*len(x)), int(.6*len(x)), int(.9*len(x))])
    return pd.Series([len(a), len(b), len(c)], index=['30','30','30'])

df = user.groupby('User').apply(f)
df['sum'] = df.sum(axis=1)
print (df)
      30  30  30  sum
User                 
101    1   2   1    4
102    2   2   2    6

and for 30-30-40:

user = user.sort_values(['User','Count'], ascending=[True, False])

def f(x):
    #split to 3 groups, because 3 + 3 + 4 == 1
    a, b, c = np.split(x, [int(.3*len(x)), int(.6*len(x))])
    return pd.Series([len(a), len(b), len(c)], index=['30','30','40'])

df = user.groupby('User').apply(f)
df['sum'] = df.sum(axis=1)
print (df)

      30  30  40  sum
User                 
101    1   2   2    5
102    2   2   3    7

EDIT:

Groups should be created by list comprehension:

def f(x):
    a, b, c = np.split(x.index, [int(.3*len(x)), int(.6*len(x))])
    L = [a,b,c]
    return [i for i, y in zip(range(len(L),0,-1) ,L) for j in y]

user['Groups'] = user.groupby('User')['User'].transform(f)
print (user)
   User    Country  Count  Groups
0   101      India     85       3
1   101      Japan     78       2
2   101      India     70       2
6   101    Austria     60       1
5   101         UK      8       1
7   102      Japan     30       3
4   102      Japan      6       3
9   102         UK      6       2
3   102     Brazil      5       2
8   102  Singapore      5       1
10  102         UK      5       1
11  102         UK      4       1
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252