Substitute in column of dataframe if the integer values meet certain criteria

Question

Instead of keeping Age as numbers, I need to replace each value in the data frame with the label of the age group it falls into.

import pandas as pd
# initialise data of lists.
data = {'Name':['Tom', 'nick', 'krish', 'jack','Ann','James'], 
        'Age':[20, 21, 45, 58,34,60]} 
  
# Create DataFrame 
df = pd.DataFrame(data)

This is what I tried:

if df['Age'] < 20:
    df['Age']= df['Age'].replace([<20],'<20')

if df['Age'] >= 20 & >40:
    df['Age']= df['Age'].replace([>=20&<40],'>=20&<40')

if df['Age'] >=40:
    df['Age']= df['Age'].replace([>=40],'>=40')

`df['Age2'] = pd.cut(df['Age'], bins=[-np.inf, 20, 40, np.inf], labels=['<20', '20-40', '>=40'], right=False) ` will do it in a single line. — cs95, Jul 07 '20 at 21:23
Oops, forgot to add `right=False` param to my previous comment. But that will do it. Please consider upvoting the answer in the duplicate post if it helped. — cs95, Jul 07 '20 at 21:26
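
For reference, here is a minimal runnable sketch of the pd.cut one-liner from the comment above, assuming the same sample df as in the question. The Age2 column name and the labels come from the comment. Note that pd.cut returns a categorical column rather than plain strings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'nick', 'krish', 'jack', 'Ann', 'James'],
    'Age': [20, 21, 45, 58, 34, 60],
})

# right=False closes each bin on the left: [-inf, 20), [20, 40), [40, inf)
df['Age2'] = pd.cut(
    df['Age'],
    bins=[-np.inf, 20, 40, np.inf],
    labels=['<20', '20-40', '>=40'],
    right=False,
)

print(df)
```

With right=False an age of exactly 20 falls into '20-40' and an age of exactly 40 into '>=40'.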

wwnde · Accepted Answer · 2020-07-07T21:29:52.140

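Use np.select(conditions, choices): pass a list of boolean conditions and a matching list of labels, and each row gets the label of the first condition it satisfies.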

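import numpy as np

# Half-open range [20, 40): between(20, 40) is inclusive and would also match 40.
c1 = df['Age'] < 20
c2 = (df['Age'] >= 20) & (df['Age'] < 40)
c3 = df['Age'] >= 40
cond = [c1, c2, c3]
choice = ['<20', '>=20&<40', '>=40']
# A string default ('unknown' is a placeholder label) keeps the result dtype
# consistent with the string choices; it is only used if no condition matches.
df['agerange'] = np.select(cond, choice, default='unknown')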

     Name  Age  agerange
0    Tom   20  >=20&<40
1   nick   21  >=20&<40
2  krish   45      >=40
3   jack   58      >=40
4    Ann   34  >=20&<40
5  James   60      >=40
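
The sample data has no age exactly on the upper boundary, so a quick check with edge values can confirm which group 20 and 40 land in. This is a minimal sketch, assuming the groups are meant to be half-open ([20, 40)). The ages series and the 'unknown' default label are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical edge-case ages, not part of the original data
ages = pd.Series([19, 20, 39, 40], name='Age')

conditions = [
    ages < 20,
    (ages >= 20) & (ages < 40),
    ages >= 40,
]
choices = ['<20', '>=20&<40', '>=40']

# The default is a string so it shares a dtype with the choices
labels = np.select(conditions, choices, default='unknown')

print(labels.tolist())  # ['<20', '>=20&<40', '>=20&<40', '>=40']
```

If the middle condition were written with an inclusive upper bound, 40 would match both it and the last condition, and np.select would return the label of whichever comes first in the list.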

edited Jul 07 '20 at 21:29

answered Jul 07 '20 at 21:24
