1

I have this dataset and would like to calculate each person's age from the date of birth:

Name       DOB 
John      1995-12-04
James     1997-10-01
Jacoob    1997-08-30
Hansard   1995-03-12
Yusoft    1992-12-12
Henry     1993-02-12

I have tried this code:

now = pd.Timestamp('now')
df['age'] = (now - df['DOB'])

but I get this error:

*A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead*

I also tried `df.loc[df['DOB']]`, but that didn't work either.

tan
  • What is `CustomerDemographic`? – Scott Hunter Dec 13 '21 at 14:23
  • What do you get from `CustomerDemographic.dtypes`? – vojtam Dec 13 '21 at 14:24
  • My guess is that `CustomerDemographic` is just a slice of another DataFrame. You should assign the value to the original DataFrame instead of the slice, or assign it to a new DataFrame. Try `assign`: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.assign.html – Yong Wang Dec 13 '21 at 14:33
  • The error is in fact not really in this line but on a previous one in your code. Check https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – Corralien Dec 13 '21 at 14:42
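The comments suggest the warning comes from an earlier line where the DataFrame was created as a slice of another one. A minimal sketch of that situation and one common way around it, taking an explicit `.copy()`, is shown here. The parent frame `customers` and its `Active` column are made up for illustration, and the reference date is fixed so the result is reproducible:

```python
import pandas as pd

# Hypothetical parent frame; the question does not show where the data comes from.
customers = pd.DataFrame({
    "Name": ["John", "James", "Henry"],
    "DOB": ["1995-12-04", "1997-10-01", "1993-02-12"],
    "Active": [True, False, True],  # made-up column, only used for filtering
})

# .copy() makes the subset an independent DataFrame, so adding
# columns to it no longer triggers SettingWithCopyWarning.
df = customers[customers["Active"]].copy()
df["DOB"] = pd.to_datetime(df["DOB"])

# Fixed reference date for reproducibility; use pd.Timestamp("now") in practice.
now = pd.Timestamp("2021-12-13")

# True where this year's birthday has not happened yet.
birthday_pending = (df["DOB"].dt.month > now.month) | (
    (df["DOB"].dt.month == now.month) & (df["DOB"].dt.day > now.day)
)
df["age"] = now.year - df["DOB"].dt.year - birthday_pending.astype(int)

print(df)
```

Whether `.copy()` is the right fix depends on whether the new column is also needed in the original frame; if it is, assigning on the original (for example with `.loc`) is the alternative the linked question discusses.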

3 Answers

1

Use the `.dt.year` property to compute the age, subtracting one if this year's birthday has not happened yet:

now = pd.Timestamp('now')

CustomerDemographic['DOB'] = pd.to_datetime(CustomerDemographic['DOB'])
CustomerDemographic['age'] = now.year - CustomerDemographic['DOB'].dt.year - (now.dayofyear < CustomerDemographic['DOB'].dt.dayofyear)
print(CustomerDemographic)

# Output:
      Name        DOB  age
0     John 1995-12-04   26
1    James 1997-10-01   24
2   Jacoob 1997-08-30   24
3  Hansard 1995-03-12   26
4   Yusoft 1992-12-12   29
5    Henry 1993-02-12   28
Corralien
0

I came up with a slightly different solution for your case (don't mind the data-reading part). It might be helpful for others too.

import pandas as pd
from datetime import datetime, date

# This function converts given date to age
def age(born):
    born = datetime.strptime(born, "%Y-%m-%d").date()
    today = date.today()
    # Subtract one year if this year's birthday has not happened yet
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

df = pd.read_csv("data.csv", sep=r'\s+')
df['age'] = df['DOB'].apply(age)
print(df)

And the output is:

      Name         DOB  age
0     John  1995-12-04   26
1    James  1997-10-01   24
2   Jacoob  1997-08-30   24
3  Hansard  1995-03-12   26
4   Yusoft  1992-12-12   29
5    Henry  1993-02-12   28
Gabriel Pellegrino
0

There are many ways to calculate the age, but here is a short one:

from datetime import datetime
now = datetime.now()
CustomerDemographic['age'] = CustomerDemographic['DOB'].apply(lambda x: int((now - datetime.strptime(x, '%Y-%m-%d')).days/365.25))
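One caveat with dividing by 365.25: it is an approximation and can be off by one on or near a birthday. The sketch below compares it with a calendar-based calculation for a single date; the helper names `age_by_days` and `age_by_calendar` are only illustrative, and the reference date is fixed so the result is reproducible:

```python
from datetime import datetime

def age_by_days(dob, now):
    # Approximation: elapsed days divided by the average year length
    return int((now - datetime.strptime(dob, "%Y-%m-%d")).days / 365.25)

def age_by_calendar(dob, now):
    # Exact: subtract one if this year's birthday has not happened yet
    born = datetime.strptime(dob, "%Y-%m-%d")
    return now.year - born.year - ((now.month, now.day) < (born.month, born.day))

now = datetime(2022, 1, 1)  # fixed reference date, the person's 21st birthday

print(age_by_days("2001-01-01", now))      # 20 (7670 days / 365.25 is just under 21)
print(age_by_calendar("2001-01-01", now))  # 21
```

If an occasional off-by-one on a birthday matters, the calendar comparison is the safer choice; otherwise the one-liner is fine.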