Grouping numerical values in categories

Question

I have numeric student marks (GPA) and I would like to group them into three categories: A, B and C.

import pandas as pd

df = pd.DataFrame([('Adel', 3.5),
                   ('Betty', 2.75),
                   ('Djamel', 2.10),
                   ('Ramzi', 1.75),
                   ('Alexa', 3.15)],
                  columns=('Name', 'GPA'))

I tried the pd.cut() function, but it didn't give the result I wanted.

score 1 · Accepted Answer · answered Jan 12 '20 at 07:56

Here's a way using pd.cut:

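# Sorting is only for display; pd.cut does not require sorted input.
df = df.sort_values('GPA')

# bins=3 splits the GPA range into three equal-width intervals;
# labels are assigned from the lowest interval to the highest.
df['bins'] = pd.cut(df['GPA'], bins=3, labels=['A', 'B', 'C'])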

     Name   GPA bins
3   Ramzi  1.75    A
2  Djamel  2.10    A
1   Betty  2.75    B
4   Alexa  3.15    C
0    Adel  3.50    C
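Note that in the output above the lowest GPA receives 'A', because labels are matched to bins in ascending order. If 'A' is meant to be the top grade (an assumption about the question's intent, not something the asker stated), the labels can simply be reversed. The sketch below also passes retbins=True so the computed bin edges can be inspected; the variable names grades and edges are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([('Adel', 3.5),
                   ('Betty', 2.75),
                   ('Djamel', 2.10),
                   ('Ramzi', 1.75),
                   ('Alexa', 3.15)],
                  columns=('Name', 'GPA'))

# Labels go from the lowest bin to the highest, so listing them as
# C, B, A gives the highest GPAs the label 'A'.
# retbins=True additionally returns the edges that pd.cut computed.
grades, edges = pd.cut(df['GPA'], bins=3, labels=['C', 'B', 'A'], retbins=True)
df['bins'] = grades

print(df.sort_values('GPA'))
print(edges)  # four edges delimiting the three equal-width bins
```

Equal-width bins depend on the minimum and maximum of the data, so the same GPA may land in a different category if the data changes.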

YOLO


score 1 · Answer 2 · answered Jan 13 '20 at 09:59

A recent study used particle swarm optimization (PSO) to classify students into an unknown number of groups. PSO showed improved capabilities compared to a genetic algorithm (GA). I think that study is all you need.

The paper is: Forming automatic groups of learners using particle swarm optimization for applications of differentiated instruction

You can find the paper here: https://doi.org/10.1002/cae.22191

Perhaps the researchers could guide you via ResearchGate: https://www.researchgate.net/publication/338078753

You would just need to drop the part of the technique that determines the number of groups automatically.

SAM.Am · Answer 3 · 2020-01-12T08:53:57.247

I found this solution:

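import numpy as np
import pandas as pd

df = pd.DataFrame({'GPA': [99, 53, 71, 84, 84],
                   'Name': ['Betty', 'Djamel', 'Ramzi', 'Alexa', 'Adel']})

# Left-closed bin edges: [0, 50) -> F, [50, 60) -> D, ..., [80, inf) -> A.
# The last edge is np.inf rather than 100 so that a score of exactly 100
# still maps to 'A' instead of falling outside the last bin.
bins = [0, 50, 60, 70, 80, np.inf]
names = ['F', 'D', 'C', 'B', 'A']

# np.digitize returns 1-based bin indices, so the lookup keys start at 1.
d = dict(enumerate(names, 1))

df['Rank'] = np.vectorize(d.get)(np.digitize(df['GPA'], bins))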

Thanks to this link here.
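For comparison, the same fixed cut-offs can be passed straight to pd.cut, which skips the dictionary lookup. This is only a sketch under two assumptions: intervals are closed on the left (right=False), which mirrors the default behaviour of np.digitize, and the top edge is np.inf so that a score of exactly 100 still counts as 'A'.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'GPA': [99, 53, 71, 84, 84],
                   'Name': ['Betty', 'Djamel', 'Ramzi', 'Alexa', 'Adel']})

# Assumed cut-offs: the open-ended top edge keeps a score of 100 in 'A'.
bins = [0, 50, 60, 70, 80, np.inf]
names = ['F', 'D', 'C', 'B', 'A']

# right=False makes each interval closed on the left, e.g. [50, 60),
# matching what np.digitize does by default.
df['Rank'] = pd.cut(df['GPA'], bins=bins, labels=names, right=False)

print(df)
```

The resulting column is categorical and ordered from 'F' to 'A', so it can be sorted or compared by grade.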

edited Jan 12 '20 at 08:53

answered Jan 12 '20 at 08:38
