I have a large panel dataset in a pandas DataFrame. The example data can be found here:
import pandas as pd
df = pd.read_csv('example_data.csv')
df.head()
ID Year y DOB Year_of_death event
223725 1991 6 1975.0 2021 No
223725 1992 6 1975.0 2021 No
223725 1993 6 1975.0 2021 No
223725 1994 6 1975.0 2021 No
223725 1995 6 1975.0 2021 No
I want to change the values in the column event so that if the Year value equals the Year_of_death value, then the observation in event for that specific row of the ID changes to Yes; otherwise it remains No.
For example, ID 68084329 died in 2012 but has the value Yes in every observation in the column event. I want to change it so that only the row with Year 2012 for this ID has Yes in event. All other event values for this ID should be No.
df.loc[df['ID'] == '68084329']
ID Year y DOB Year_of_death event
68084329 1991 6 1942.0 2012 Yes
68084329 1992 5 1942.0 2012 Yes
68084329 1993 5 1942.0 2012 Yes
68084329 1994 6 1942.0 2012 Yes
68084329 1995 6 1942.0 2012 Yes
68084329 1996 5 1942.0 2012 Yes
68084329 1997 6 1942.0 2012 Yes
68084329 1998 5 1942.0 2012 Yes
68084329 1999 6 1942.0 2012 Yes
68084329 2000 6 1942.0 2012 Yes
68084329 2001 6 1942.0 2012 Yes
68084329 2002 5 1942.0 2012 Yes
68084329 2003 6 1942.0 2012 Yes
68084329 2004 5 1942.0 2012 Yes
68084329 2005 5 1942.0 2012 Yes
68084329 2006 6 1942.0 2012 Yes
68084329 2007 6 1942.0 2012 Yes
68084329 2008 6 1942.0 2012 Yes
68084329 2010 5 1942.0 2012 Yes
68084329 2011 5 1942.0 2012 Yes
68084329 2012 0 1942.0 2012 Yes
How do I make these changes for a large DataFrame with many IDs in accordance with the above conditions?
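One possible approach, sketched on a small toy frame rather than the real data: since the condition only compares two columns within the same row, a vectorized comparison may be enough and no per-ID grouping should be needed. The toy values below are made up for illustration, and the sketch assumes Year and Year_of_death are both stored as numbers (not strings).

```python
import numpy as np
import pandas as pd

# Toy frame with the same column names as in the question.
# The rows are illustrative only, not the real data.
df = pd.DataFrame({
    "ID": [68084329, 68084329, 68084329, 223725, 223725],
    "Year": [2010, 2011, 2012, 1994, 1995],
    "Year_of_death": [2012, 2012, 2012, 2021, 2021],
    "event": ["Yes", "Yes", "Yes", "No", "No"],
})

# Row-wise comparison: True only where the observation year is the death year.
# Rows with a missing Year_of_death (NaN) compare as False.
died_this_year = df["Year"] == df["Year_of_death"]

# Overwrite the whole column: "Yes" where the years match, "No" everywhere else.
df["event"] = np.where(died_this_year, "Yes", "No")

print(df)
```

On the toy frame, only the 2012 row of ID 68084329 ends up with Yes. If Year_of_death were read in as text, it would presumably need converting first (for example with pd.to_numeric) before the comparison behaves as intended.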