For each column of my dataframe, I want to find the row in which the value 1 occurs last. Given this last row index, I want to calculate the "recency" of the occurrence. Like so:
>> df = pandas.DataFrame({"a":[0,0,1,0,0],"b":[1,1,1,1,1],"c":[1,0,0,0,1],"d":[0,0,0,0,0]})
>> df
   a  b  c  d
0  0  1  1  0
1  0  1  0  0
2  1  1  0  0
3  0  1  0  0
4  0  1  1  0
Desired result:
>> calculate_recency_vector(df)
[3,1,1,None]
The desired result shows for each column "how many rows ago" the value 1 appeared for the last time. E.g. for column a, the value 1 appears last in the 3rd-last row, hence the recency of 3 in the result vector. Any ideas how to implement this?
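To pin down the definition, here is a naive pure-Python sketch of the behaviour I am after. It is only a reference for the expected output, not an efficient solution. The `value` parameter is an assumption added for illustration, and a column that never contains the value gets None. The example uses a plain dict of lists so that it runs without pandas; since iterating over a DataFrame yields its column labels and `df[col]` is iterable, the same function should also accept the dataframe from above.

```python
def calculate_recency_vector(data, value=1):
    """Naive reference: for each column, how many rows ago `value` last occurred.

    `data` maps column labels to sequences of equal length
    (a dict of lists, or a pandas DataFrame).
    """
    recency = []
    for col in data:
        values = list(data[col])
        # Positional indices of the rows where the column equals `value`
        hits = [i for i, v in enumerate(values) if v == value]
        # The last row counts as "1 row ago"; no hit at all means no recency
        recency.append(len(values) - hits[-1] if hits else None)
    return recency


columns = {
    "a": [0, 0, 1, 0, 0],
    "b": [1, 1, 1, 1, 1],
    "c": [1, 0, 0, 0, 1],
    "d": [0, 0, 0, 0, 0],
}
print(calculate_recency_vector(columns))  # [3, 1, 1, None]
```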
Edit: to avoid confusion, I changed the desired output for the last column from 0 to None. This column has no recency because the value 1 does not occur at all.
Edit II: Thanks for the great answers! I have to calculate this recency vector approx. 150k times on dataframes shaped (42,250). A more efficient solution would be much appreciated.