I want to create a new column in Python that groups the values of an existing column read from a CSV file.

For example, my sample data has an age column, and I need a new column that groups ages into 'Young', 'Adult' and 'Elder'.

I am using pandas, and my code currently looks like this:

import pandas as pd
insurance = pd.read_csv ('insurance.csv')
print(insurance)
insurance['age_cat']= if age < 24: return 'Young' elif x < 55: return 'Adult' 
elif x >=56: return 'Elder' else: return 'other'

How would I do this?

theletz

2 Answers

You can write a function that maps one age to a label and apply it to the age column with the apply method.

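def f(age):
    # Map a single age to a category label.
    if age < 24:
        return 'Young'
    elif age < 55:
        return 'Adult'
    elif age >= 56:
        return 'Elder'
    else:
        # Reached for ages from 55 up to (but not including) 56, and for missing values (NaN)
        return 'other'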


insurance['age_cat'] = insurance['age'].apply(f)
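
To try this without the CSV, here is a minimal sketch on a small made-up DataFrame. The ages are invented for illustration, and the function is called age_category here instead of f. It also shows that an age of exactly 55, or a missing value, ends up as 'other' with these cut-offs.

```python
import numpy as np
import pandas as pd


def age_category(age):
    # Same cut-offs as in the question
    if age < 24:
        return 'Young'
    elif age < 55:
        return 'Adult'
    elif age >= 56:
        return 'Elder'
    else:
        return 'other'


# Made-up sample standing in for insurance.csv (hypothetical values)
sample = pd.DataFrame({'age': [18, 23, 24, 54, 55, 56, 70, np.nan]})

# apply calls age_category once per value in the column
sample['age_cat'] = sample['age'].apply(age_category)
print(sample)
```

Because apply calls the function once per row, this is easy to read but can be slow on very large columns.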
ywbaek

You can use pandas cut for this:

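# right=False makes each bin include its left edge: [0, 24), [24, 55), [55, 56), [56, 999)
insurance['age_cat'] = pd.cut(insurance['age'], bins=[0, 24, 55, 56, 999], right=False,
                              labels=['Young', 'Adult', 'other', 'Elder'])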
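
To see where the bin edges fall, here is a small self-contained sketch on made-up ages. It assumes the goal is the question's cut-offs (under 24, under 55, 56 and over, with 55 left as 'other') and passes right=False so that each bin includes its left edge and excludes its right edge. The sample values and the upper bound of 999 are illustrative assumptions.

```python
import pandas as pd

# Made-up sample standing in for insurance.csv (hypothetical values)
sample = pd.DataFrame({'age': [18, 23, 24, 54, 55, 56, 70]})

# Bins with right=False: [0, 24), [24, 55), [55, 56), [56, 999)
sample['age_cat'] = pd.cut(
    sample['age'],
    bins=[0, 24, 55, 56, 999],
    right=False,
    labels=['Young', 'Adult', 'other', 'Elder'],
)
print(sample)
```

The new column has a categorical dtype, and any age outside the bins (negative, or 999 and above) or missing becomes NaN rather than 'other'.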
theletz