
I'm quite new to pandas and need a bit of help. I have a column of ages and need to group them as follows:

  • Young people: age ≤ 30
  • Middle-aged people: 30 < age ≤ 60
  • Old people: 60 < age

Here is my code, but it gives me an error:

def get_num_people_by_age_category(dataframe):
    young, middle_aged, old = (0, 0, 0)
    dataframe["age"] = pd.cut(x=dataframe['age'], bins=[30,31,60,61], labels=["young","middle_aged","old"])
    return young, middle_aged, old
ages = get_num_people_by_age_category(dataframe) 
print(dataframe)
Trenton McKinney
babavyna
  • Please provide a [Minimal, Reproducible Example (e.g. code, data, errors) as text](https://stackoverflow.com/help/minimal-reproducible-example). [Create a reproducible copy of the DataFrame with `df.head(20).to_clipboard(sep=',')`](https://stackoverflow.com/questions/52413246/how-to-provide-a-copy-of-your-dataframe-with-to-clipboard), edit the question, and paste the clipboard into a code block. – Trenton McKinney Jul 07 '20 at 06:00
  • Have you tried changing `return young, middle_aged, old` to `return dataframe`? – jezrael Jul 07 '20 at 06:01
  • As @jezrael said. Also, you never do anything with `young, middle_aged, old = (0, 0, 0)`. – Trenton McKinney Jul 07 '20 at 06:03
  • Hi @jezrael, I tried that but got the same error message. By the way, if I replace `dataframe` with `income_data` (the DataFrame I actually work with), I get a different error: `TypeError: '<' not supported between instances of 'int' and 'str'` – babavyna Jul 07 '20 at 10:25
  • @babavyna - Could you change `dataframe["age"] = pd.cut(x=dataframe['age'], bins=[30,31,60,61], labels=["young","middle_aged","old"])` to `dataframe["age"] = pd.cut(x=pd.to_numeric(dataframe['age'], errors='coerce'), bins=[30,31,60,61], labels=["young","middle_aged","old"])`? – jezrael Jul 07 '20 at 10:28
  • @jezrael, I tried your suggestion and it works as well. Thanks. – babavyna Jul 07 '20 at 21:40

1 Answer


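The code below assigns age groups using pd.cut(). By default the bins are closed on the right, so bins=[0,30,60,100] produces the intervals (0, 30], (30, 60] and (60, 100], which match the three categories in the question.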

# Import libraries
import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'age': [1, 20, 30, 31, 50, 60, 61, 80, 90]  # or random ages: np.random.randint(1, 100, 50) (requires numpy)
})

# Function: Copy-pasted from question and modified
def get_num_people_by_age_category(df):
    df["age_group"] = pd.cut(x=df['age'], bins=[0,30,60,100], labels=["young","middle_aged","old"])
    return df

# Call function
df = get_num_people_by_age_category(df)

Output

print(df)

   age    age_group
0    1        young
1   20        young
2   30        young
3   31  middle_aged
4   50  middle_aged
5   60  middle_aged
6   61          old
7   80          old
8   90          old
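
If the goal is the three counts that the question's function tried to return, the same pd.cut() call can be followed by value_counts(). The following is only a minimal sketch: the function name count_people_by_age_category is hypothetical, the pd.to_numeric() step is taken from the comments on the question (for ages stored as strings), and using np.inf as the last bin edge is my assumption so that ages above 100 still count as "old".

```python
import numpy as np
import pandas as pd

# Hypothetical helper (not from the original post): returns the number of
# people in each age category instead of the labelled DataFrame.
def count_people_by_age_category(df):
    # Convert ages to numbers; entries that cannot be parsed become NaN
    ages = pd.to_numeric(df["age"], errors="coerce")
    # Right-closed bins: (0, 30], (30, 60], (60, inf]
    # np.inf as the upper edge is an assumption, so no upper age limit applies
    groups = pd.cut(ages, bins=[0, 30, 60, np.inf],
                    labels=["young", "middle_aged", "old"])
    # value_counts() on a categorical column reports every category,
    # including those with zero members; NaN ages are not counted
    counts = groups.value_counts()
    return counts["young"], counts["middle_aged"], counts["old"]

df = pd.DataFrame({"age": [1, 20, 30, 31, 50, 60, 61, 80, 90]})
young, middle_aged, old = count_people_by_age_category(df)
print(young, middle_aged, old)  # 3 3 3
```

Unlike the function above, this version does not modify the DataFrame that is passed in. Ages of 0 or below, and values that cannot be converted to numbers, fall outside every bin and are left out of the counts.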
Nilesh Ingle