I have the following dataframe:
id number
1 13
1 13
1 NaN
1 NaN
2 11
2 11
2 11
2 NaN
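For reproducibility, here is a minimal sketch that builds the frame shown above. It assumes a default integer index and that the missing entries are np.nan; the real data may differ.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above (default RangeIndex assumed).
df = pd.DataFrame(
    {
        "id": [1, 1, 1, 1, 2, 2, 2, 2],
        "number": [13, 13, np.nan, np.nan, 11, 11, 11, np.nan],
    }
)
```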
I want to find the last non-NaN value per id (the first one when counting up from the bottom of each group) and mark it with a 1. The result should look like this:
id number code
1 13 NaN
1 13 1
1 NaN NaN
1 NaN NaN
2 11 NaN
2 11 NaN
2 11 1
2 NaN NaN
I tried the following command, planning to go from there:
df["test"] = df.groupby("id")["number"].first_valid_index()
It gives me the following error: Cannot access callable attribute 'first_valid_index' of 'SeriesGroupBy' objects, try using the 'apply' method
Then I tried this:
df['test'] = df.groupby("id")['number'].apply(lambda x: x.first_valid_index())
But this just gives me a column of NaT values.
Does anybody know how the problem could be solved efficiently?
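One possible approach, as a rough sketch rather than a definitive answer. It assumes the expected output above is the goal, meaning the last non-NaN row of each id gets the 1, and it assumes a default integer index. Note that the groupby/apply attempt returns one entry per id, indexed by id, so assigning it directly to a column aligns on index labels instead of filling each row. The sketch below avoids that by collecting row labels first and then assigning with .loc.

```python
import numpy as np
import pandas as pd

# Sample data from the question (default RangeIndex is an assumption).
df = pd.DataFrame(
    {
        "id": [1, 1, 1, 1, 2, 2, 2, 2],
        "number": [13, 13, np.nan, np.nan, 11, 11, 11, np.nan],
    }
)

# Keep only rows with a value, then take the last such row of each id.
# groupby(...).tail(1) preserves the original row labels.
valid = df[df["number"].notna()]
last_valid_idx = valid.groupby("id").tail(1).index

# Start with an all-NaN column and mark the selected rows with 1.
df["code"] = np.nan
df.loc[last_valid_idx, "code"] = 1
```

If the first non-NaN row per id is wanted instead, swapping tail(1) for head(1) should give that. Ids whose values are all NaN simply get no mark, because they drop out in the filtering step.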