I have a DataFrame with the locations of some customers (a Customer_id column plus Lat and Lon columns), and I am trying to interpolate the NaNs separately for each customer.
For example, suppose I interpolate with the nearest method on this data (the values are made up):
Customer_id Lat Lon
A 1 1
A NaN NaN
A 2 2
B NaN NaN
B 4 4
I would like the NaN for B to be 4 and not 2.
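To pin down the result I expect, here is a small plain-Python sketch of "nearest within the same customer" on the made-up Lat values above (Lon would be handled the same way). The helper name nearest_fill_per_customer is made up for this post and is not a pandas function, and the tie-breaking rule (earlier row wins) is just an assumption for illustration.

```python
import math

# Made-up (Customer_id, Lat) rows from the example above.
rows = [
    ("A", 1.0),
    ("A", math.nan),
    ("A", 2.0),
    ("B", math.nan),
    ("B", 4.0),
]

def nearest_fill_per_customer(rows):
    """Fill each NaN with the closest non-NaN value of the same customer."""
    filled = []
    for i, (cust, val) in enumerate(rows):
        if not math.isnan(val):
            filled.append((cust, val))
            continue
        # Candidate donors: rows of the same customer that hold a value,
        # paired with their distance (in rows) from the missing row.
        donors = [(abs(j - i), v) for j, (c, v) in enumerate(rows)
                  if c == cust and not math.isnan(v)]
        if donors:
            # min() keeps the first minimum, so a tie goes to the earlier row.
            val = min(donors, key=lambda d: d[0])[1]
        filled.append((cust, val))
    return filled

filled = nearest_fill_per_customer(rows)
for cust, lat in filled:
    print(cust, lat)
```

The point is only that B's missing value is taken from B's own rows, never from A's.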
I have tried this:
series.groupby('Customer_id').apply(lambda group: group.interpolate(method = 'nearest', limit_direction = 'both'))
The number of NaNs goes down from 9003 to 94, but I don't understand why some missing values are still left.
I checked, and these 94 missing values belong to customers whose other records were interpolated. For example:
Customer_id Lat
0. A 1
1. A NaN
2. A NaN
3. A NaN
4. A NaN
It interpolates correctly up to some row (say rows 1, 2 and 3 are filled correctly) and then leaves row 4 as NaN.
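One check I could run is whether the leftover NaNs always sit before the first or after the last known value of a customer. The snippet below is only a sketch on made-up data: the column names match my example, but the "edge" idea is my guess at a pattern, not something I have confirmed on the real data.

```python
import numpy as np
import pandas as pd

# Made-up data: customer A has trailing NaNs, customer B has a leading NaN.
df = pd.DataFrame({
    "Customer_id": ["A", "A", "A", "A", "A", "B", "B"],
    "Lat": [1.0, np.nan, 2.0, np.nan, np.nan, np.nan, 4.0],
})

summary = {}
for cust, s in df.groupby("Customer_id")["Lat"]:
    # A position is "inside" if a known value exists both at or before it
    # and at or after it within the same customer.
    inside = s.ffill().notna() & s.bfill().notna()
    # Edge NaNs are missing values with no known value on one side.
    edge = s.isna() & ~inside
    summary[cust] = (int(s.isna().sum()), int(edge.sum()))

for cust, (total, at_edge) in summary.items():
    print(cust, "NaNs:", total, "at an edge:", at_edge)
```

Running the same loop on the interpolated output instead of the raw data would show whether every remaining NaN is counted as an edge NaN.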
I have tried setting limit in interpolate to a value greater than the maximum number of records per customer, but it still does not work. I don't know where my mistake is; can somebody help?
(I don't know whether it is relevant, but I fabricated my own NaNs for this. This is the code I used: "Replace some values in a dataframe with NaN's if the index of the row does not exist in another dataframe". I think the problem isn't there, but since I'm very confused about where the issue actually is, I'll just leave it here.)