I have a pandas DataFrame in which half of the rows (those where buffer_id = 'off') have missing values (NaN or the sentinel -9999) in three of the columns. I'm trying to fill those missing values with the corresponding values from the other half (where buffer_id = 'on'), matched on site ID.
Here is a reproducible example:
import numpy as np
import pandas as pd

# np.nan marks real missing values (the string 'NaN' would not count as missing)
df = pd.DataFrame({'site_id': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'buffer_id': ['on', 'on', 'on', 'off', 'off', 'off'],
                   'easting': [111, 222, 333, np.nan, np.nan, np.nan],
                   'northing': [444, 555, 666, np.nan, np.nan, np.nan],
                   'year': [1990, 1995, 2000, -9999, -9999, -9999],
                   'ndvi': [12, 22, 32, 42, 52, 62]})
So the missing values in 'easting', 'northing', and 'year' in each 'off' row should be filled with the values from the 'on' row that has the same site ID.
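To make the intended result concrete, here is a minimal sketch of one way it might be produced. It assumes that each site ID appears exactly once among the 'on' rows, that the 'on' rows are complete, and that both NaN and -9999 count as missing. The names `cols`, `on_rows`, and `filled` are only illustrative, and I'm not sure this is the idiomatic way to do it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'site_id': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'buffer_id': ['on', 'on', 'on', 'off', 'off', 'off'],
                   'easting': [111, 222, 333, np.nan, np.nan, np.nan],
                   'northing': [444, 555, 666, np.nan, np.nan, np.nan],
                   'year': [1990, 1995, 2000, -9999, -9999, -9999],
                   'ndvi': [12, 22, 32, 42, 52, 62]})

cols = ['easting', 'northing', 'year']  # columns that need filling

# Lookup table built from the complete 'on' rows, indexed by site_id
# (assumes site_id is unique within the 'on' rows)
on_rows = df[df['buffer_id'] == 'on'].set_index('site_id')

filled = df.copy()
for col in cols:
    # Turn the -9999 sentinel into NaN so both kinds of missing value match
    with_nan = filled[col].mask(filled[col] == -9999)
    # Fill each gap with the 'on' value for the same site_id
    filled[col] = with_nan.fillna(filled['site_id'].map(on_rows[col]))

print(filled)
```

With this approach the filled columns come out as floats, because they held NaN at an intermediate step, and 'ndvi' is left untouched.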
How would you go about doing that?
Thank you.