I have a DataFrame with one row per Name and one column per Date, with weight values in the cells:
Name Jan17 Jun18 Dec18 Apr19 count
Nick 0 1.7 3.7 0 2
Jack 0 0 2.8 3.5 2
Fox 0 1.7 0 0 1
Rex 1.0 0 3.0 4.2 3
Snack 0 0 2.8 4.4 2
Yosee 0 0 0 4.3 1
Petty 0.5 1.3 2.8 3.5 4
Start and Finish columns should be added to the DataFrame according to the following definitions:
Start: the column of the first non-zero value in the row, scanning from the Jan17 column to Apr19.
Finish: the column of the first non-zero value in the row, scanning in reverse from Apr19 to Jan17.
Also, if a row has only one non-zero value, then Start and Finish are the same.
To find the first non-zero element in each row I tried data[col].keys() together with np.argmax(), and it works as expected:
date_col_list = ['Jan17','Jun18','Dec18', 'Apr19']
data['Start'] = data[date_col_list].keys()[np.argmax(data[date_col_list].values != 0, axis=1)]
The result is:
Name Jan17 Jun18 Dec18 Apr19 count Start
Nick 0 1.7 3.7 0 2 Jun18
Jack 0 0 2.8 3.5 2 Dec18
Fox 0 1.7 0 0 1 Jun18
Rex 1.0 0 3.0 4.2 3 Jan17
Snack 0 0 2.8 4.4 2 Dec18
Yosee 0 0 0 4.3 1 Apr19
Petty 0.5 1.3 2.8 3.5 4 Jan17
To fill the Finish column I tried np.apply_along_axis as follows:
def func_X(i):
    return np.argmax(np.where(i!=0))
np.apply_along_axis(func1d = func_X, axis=1, arr=data[date_col_list].values)
The result is an error:
'tuple' object has no attribute 'argmax'
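The tuple in that message most likely comes from np.where: called with only a condition, it returns a tuple of index arrays (one per dimension), not a single array. A minimal sketch on one row of the sample data, with illustrative variable names that are not part of the original code:

```python
import numpy as np

row = np.array([0, 1.7, 3.7, 0])   # Nick's row from the sample data

hits = np.where(row != 0)          # a tuple holding one index array per dimension
print(type(hits))                  # <class 'tuple'>

positions = hits[0]                # the index array itself (non-zero positions)
first_pos = positions[0]           # position of the first non-zero value
last_pos = positions[-1]           # position of the last non-zero value
```

Note that positions would be empty for a row containing only zeros, so indexing it with [0] or [-1] would raise an IndexError in that case.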
The expected DataFrame is:
Name Jan17 Jun18 Dec18 Apr19 count Start Finish
Nick 0 1.7 3.7 0 2 Jun18 Dec18
Jack 0 0 2.8 3.5 2 Dec18 Apr19
Fox 0 1.7 0 0 1 Jun18 Jun18
Rex 1.0 0 3.0 4.2 3 Jan17 Apr19
Snack 0 0 2.8 4.4 2 Dec18 Apr19
Yosee 0 0 0 4.3 1 Apr19 Apr19
Petty 0.5 1.3 2.8 3.5 4 Jan17 Apr19
How can I find Finish, that is, the first non-zero value when scanning from the last column (Apr19) to the first one (Jan17)?
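One possible approach, sketched here under assumptions, is to reuse the same argmax idea on a column-reversed boolean mask and then map the position back to the original column order. The sample frame below is rebuilt from the table in the question, with Name kept as an ordinary column and the count column left out (both are assumptions), and the helper names nonzero, cols and last_idx are illustrative only:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (Name as a regular column is an assumption).
data = pd.DataFrame({
    'Name':  ['Nick', 'Jack', 'Fox', 'Rex', 'Snack', 'Yosee', 'Petty'],
    'Jan17': [0, 0, 0, 1.0, 0, 0, 0.5],
    'Jun18': [1.7, 0, 1.7, 0, 0, 0, 1.3],
    'Dec18': [3.7, 2.8, 0, 3.0, 2.8, 0, 2.8],
    'Apr19': [0, 3.5, 0, 4.2, 4.4, 4.3, 3.5],
})
date_col_list = ['Jan17', 'Jun18', 'Dec18', 'Apr19']

# Boolean mask: True where a weight is non-zero.
nonzero = data[date_col_list].to_numpy() != 0
cols = np.array(date_col_list)

# Start: position of the first True per row, scanning left to right.
data['Start'] = cols[nonzero.argmax(axis=1)]

# Finish: reverse the column order, find the first True per row,
# then convert that position back to the original column order.
last_idx = nonzero.shape[1] - 1 - nonzero[:, ::-1].argmax(axis=1)
data['Finish'] = cols[last_idx]

print(data)
```

One caveat with this sketch: for a row with no non-zero values, argmax returns 0 in both directions, so such a row would get Start = Jan17 and Finish = Apr19. If rows like that can occur, they would need to be masked separately, for example with nonzero.any(axis=1).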