I have a dataframe that I have made by concatenating several dataframes containing various bits of information. I am then creating new columns in the dataframe with the result of operations upon the columns of data. one of the columns ['MSD'] contains a NaN at the junction between each of the dataframes that contributed to the concatenation.
I wish to create a new column MSD_cum
where each value is the cumulative sum of MSD
but I want the cumulative sum to begin anew from each NaN`.
There is a question about this already but although the code in the answers given to it execute without errors, none of them actually do what I want ie.
MSD
1
2
3
4
NaN
1
2
3
would create output
MSD_cum
1
3
6
10
NaN
1
3
6
Instead, they yield either the same as the input or
my own initial attempts which also failed and yielded the same output as the input, attempted to use an if statement to make the cumsum conditional upon the difference between the data indices.
if np.diff(data_230['time']) is 1:
data_230['MSD_cum'] = cumsum(data_230['MSD'])
else:
data_230['MSD_cum'] = data_230['MSD']
I would be happy with any approach at this point.