I have a data frame X whose sun column always starts and ends with zeros. I call .diff() on the sun column to get the difference between each interval and the previous one, but that produces large values at the start and at the end of the day (marked in yellow in data frame Y). How can I calculate the difference starting from the 3:30 timestamp, so that I get a data frame Z with zeros in place of the 100 and -142?

Krish

1 Answer

If there are no zeros inside the valid data range, drop the zero rows, take the difference, and reindex back to the original index:

df.loc[~df['sun'].eq(0), 'sun'].diff().fillna(0).reindex(df.index, fill_value=0)

Output:

2020-07-20 03:05:00     0.0
2020-07-20 03:10:00     0.0
2020-07-20 03:15:00     0.0
2020-07-20 03:20:00     0.0
2020-07-20 03:25:00     0.0
2020-07-20 03:30:00    21.0
2020-07-20 03:35:00     1.0
2020-07-20 03:40:00    12.0
2020-07-20 03:45:00   -12.0
2020-07-20 03:50:00    20.0
2020-07-20 03:55:00     0.0
2020-07-20 04:00:00     0.0
2020-07-20 04:05:00     0.0
Freq: 5T, Name: sun, dtype: float64
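
For anyone who wants to try this, here is a minimal, self-contained sketch. The sample values are an assumption: they were picked to be consistent with the 100 and -142 from the question and with the output above, and are not the original frame. It compares a plain .diff() with the masked version.

```python
import pandas as pd

# Hypothetical sample data (not the asker's frame): zeros at both ends of the day,
# non-zero readings from 03:25 to 03:50.
idx = pd.date_range("2020-07-20 03:05", "2020-07-20 04:05", freq="5min")
df = pd.DataFrame(
    {"sun": [0, 0, 0, 0, 100, 121, 122, 134, 122, 142, 0, 0, 0]},
    index=idx,
)

# Plain diff: the jumps from 0 to 100 and from 142 back to 0 show up as spikes.
plain = df["sun"].diff()

# Masked diff: keep only the non-zero rows, diff those, then restore the full
# index. Rows that were dropped (and the first kept row) become 0.
masked = (
    df.loc[df["sun"].ne(0), "sun"]
    .diff()
    .fillna(0)
    .reindex(df.index, fill_value=0)
)

print(pd.DataFrame({"plain": plain, "masked": masked}))
```

The first non-zero row has no non-zero predecessor, so its difference is NaN and fillna(0) turns it into 0. The reindex call then fills the dropped zero rows with 0 as well.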

Otherwise, let's find the start and end of the valid data range:

s = df.where(df['sun'].ne(0))
idx_start = s.first_valid_index()
idx_end = s.last_valid_index()
df.loc[idx_start:idx_end].diff().fillna(0).reindex(df.index, fill_value=0)

Output:

                      sun
2020-07-20 03:05:00   0.0
2020-07-20 03:10:00   0.0
2020-07-20 03:15:00   0.0
2020-07-20 03:20:00   0.0
2020-07-20 03:25:00   0.0
2020-07-20 03:30:00  21.0
2020-07-20 03:35:00   1.0
2020-07-20 03:40:00  12.0
2020-07-20 03:45:00 -12.0
2020-07-20 03:50:00  20.0
2020-07-20 03:55:00   0.0
2020-07-20 04:00:00   0.0
2020-07-20 04:05:00   0.0
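
A short sketch of why the range-based version matters when a zero occurs inside the day. The data here is hypothetical (one reading in the middle is set to 0), and the variable names are only for illustration. Masking out every zero would silently skip that reading, while slicing between the first and last non-zero rows keeps it.

```python
import pandas as pd

# Hypothetical data: same shape as before, but with a genuine 0 reading at 03:35.
idx = pd.date_range("2020-07-20 03:05", "2020-07-20 04:05", freq="5min")
df = pd.DataFrame(
    {"sun": [0, 0, 0, 0, 100, 121, 0, 134, 122, 142, 0, 0, 0]},
    index=idx,
)

# Locate the first and last non-zero rows.
nonzero = df["sun"].where(df["sun"].ne(0))
idx_start = nonzero.first_valid_index()
idx_end = nonzero.last_valid_index()

# Diff only inside that window (label slicing with .loc is inclusive on both
# ends), then restore the full index with zeros outside the window.
ranged = (
    df.loc[idx_start:idx_end, "sun"]
    .diff()
    .fillna(0)
    .reindex(df.index, fill_value=0)
)

# For comparison: dropping every zero hides the dip to 0 at 03:35.
masked = (
    df.loc[df["sun"].ne(0), "sun"]
    .diff()
    .fillna(0)
    .reindex(df.index, fill_value=0)
)

print(pd.DataFrame({"ranged": ranged, "masked": masked}))
```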

Scott Boston