Assume a pandas data frame whose rows hold grouped data, sorted so that all rows for a given name are adjacent. I would like to add a calculated column whose value depends on the values of another column within each group. If the first value in a group is zero, every row in that group gets the group's first non-zero value, or NaN if the group has no non-zero value. Otherwise, if the first value is non-zero, every row in the group gets a fixed value, for example -1.
Example input data frame:
  name  value
0    a      0
1    a      0
2    a      6
3    a      8
4    b      0
5    b      0
6    c      5
7    c      7
Example output data frame with the calc column created:
  name  value  calc
0    a      0     6
1    a      0     6
2    a      6     6
3    a      8     6
4    b      0   nan
5    b      0   nan
6    c      5    -1
7    c      7    -1
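To make the rule concrete, here is a minimal sketch that writes it as a per-group function and broadcasts the result with `groupby(...).transform`. The function name `calc_for_group` and the variable names are my own placeholders, and I am assuming the fixed value is -1 as in the example.

```python
import numpy as np
import pandas as pd

# Example input from above.
df = pd.DataFrame({
    "name": ["a", "a", "a", "a", "b", "b", "c", "c"],
    "value": [0, 0, 6, 8, 0, 0, 5, 7],
})

def calc_for_group(values: pd.Series) -> float:
    """Return the single calc value shared by all rows of one group (hypothetical helper)."""
    # Group starts with a non-zero value: use the fixed value.
    if values.iloc[0] != 0:
        return -1.0
    # Group starts with zero: take the first non-zero value, or NaN if there is none.
    nonzero = values[values != 0]
    return float(nonzero.iloc[0]) if len(nonzero) else np.nan

# transform broadcasts the scalar returned for each group to all rows of that group.
df["calc"] = df.groupby("name")["value"].transform(calc_for_group)
print(df)
```

Because one group has no non-zero value, the calc column ends up as a float column, so 6 and -1 are stored as 6.0 and -1.0.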
The approach I was considering is to build a lookup table holding the first non-zero value of each group. For the example above it would be:
   value
a      6
c      5
I would then iterate over the input data frame, build a list of values following the logic above, and assign that list to the new column.
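The lookup-table idea can also be sketched without an explicit loop. This is only one possible way to do it, assuming the fixed value is -1; the names `first_nonzero`, `first_value` and `looked_up` are placeholders I made up for illustration.

```python
import pandas as pd

# Example input from above.
df = pd.DataFrame({
    "name": ["a", "a", "a", "a", "b", "b", "c", "c"],
    "value": [0, 0, 6, 8, 0, 0, 5, 7],
})

# Lookup table: first non-zero value per group.
# Groups that contain only zeros (here "b") are simply absent from it.
first_nonzero = df[df["value"] != 0].groupby("name")["value"].first()

# First value of each group, repeated on every row of that group.
first_value = df.groupby("name")["value"].transform("first")

# Map each row's name to its lookup entry; names missing from the lookup become NaN.
looked_up = df["name"].map(first_nonzero)

# Keep the looked-up value where the group starts with zero, otherwise use -1.
df["calc"] = looked_up.where(first_value == 0, -1.0)
print(df)
```

Here `Series.map` replaces the row-by-row iteration, and `Series.where` applies the "first value is non-zero" branch. Row order within each group matters for both `first()` and `transform("first")`, so this relies on the data frame being in the order shown in the example.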