I have a dataset with an index, a timestamp, and a value stored in three separate columns of a pandas DataFrame, e.g.:
I want to select the rows whose timestamp hour equals 23 and add a scalar to the values in the next column. How can I do this efficiently? The index column is not set properly in the dataset, so I cannot rely on it.
Presently, I am using a for-loop to iterate over the rows, check whether the hour in the timestamp equals 23, and modify the value in the corresponding cell, but this takes a lot of time. I tried the .groupby method suggested here, as shown below, but it does not seem to work: it iterates over the data twice, leaves the data unchanged, and raises SettingWithCopyWarning. Here is what I tried, though I am not sure it is the best way to do it:
    for index, data_slice in df.groupby(df["Date"].dt.hour == 23):
        data_slice.loc["value"] += 1
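For comparison, here is a minimal sketch of the operation described above written with a boolean mask and .loc instead of a loop. It assumes the column names "Date" and "value" from the snippet, a "Date" column that is already datetime64, and a scalar of 1; the sample rows are made up purely for illustration. The loop over groupby most likely runs twice because the key has two groups (False and True), and each data_slice appears to be a copy, so changes made to it would not reach df.

```python
import pandas as pd

# Hypothetical sample data; only the column names "Date" and "value"
# come from the snippet above, the rows are invented for illustration.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2021-01-01 22:00:00",
        "2021-01-01 23:00:00",
        "2021-01-02 23:30:00",
        "2021-01-03 00:00:00",
    ]),
    "value": [10, 20, 30, 40],
})

# Boolean mask: True for rows whose timestamp hour is 23.
# It is aligned on the existing row labels, so no particular index is required.
mask = df["Date"].dt.hour == 23

# Modify the selected cells in place on the original frame.
# Selecting rows and column in one .loc call avoids chained indexing.
df.loc[mask, "value"] += 1
```

If "Date" is stored as strings, it would need converting first, for example with pd.to_datetime(df["Date"]).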