The following is a sample of my dataset:

index     time          value
 0       00:00:01        12
 1       00:00:06        18
 2       00:00:11        28
 3       00:00:15        32
 4       00:00:20        51

I would like to write a for loop that does the following:

for t in range (00:00:01, 23:59:59, 5s):
  if df.Value[t]> df.Value[t+10]:
      print ('True')
  else:
      print('False')

The loop should start at time = 00:00:01 (the first second of the day) and run to time = 23:59:59 (the end of the day) in 5-second increments. At each step it takes the value at time t and compares it with the value at time t + 10 seconds. If the value at t is larger, it prints True; otherwise, it prints False.

P.S.: The tricky part is that the time difference between data points isn't always 5 seconds; therefore, I would like to write a function that lets the program take the value at the closest available time.

For example:

In the fourth iteration (at index 3), the loop will take t = 00:00:16. However, this time isn't available in the dataset, so I would like to call a function that makes the program take the closest time to 00:00:16, which is 00:00:15.

Please note that the for loop above is theoretical, just to show what is needed. Also, the dataset is stored in an Excel sheet.

I would appreciate any help. Thanks.

Alex Davies
  • It sounds like what you really need is to check a time range. In your fourth iteration, your time range would be `t>00:00:11` and `t<=00:00:16`, correct? – Robert Harvey Mar 06 '19 at 18:31
  • @AlexDavies how is the rounding supposed to be done? Always round up? Always down? Are the dates guaranteed to be sorted in the dataframe? Otherwise the rounding is tricky. – Ralf Mar 06 '19 at 18:55
  • @Ralf Yes, the dates are sorted in the dataframe. The rounding can be up or down depending on the time. For example, if the loop takes t = 00:00:28 and there are only t = 00:00:27 and t = 00:00:30, it will be rounded down. – Alex Davies Mar 06 '19 at 19:03
  • @RobertHarvey No, the program should take the value at a certain time and compare it with the value at time + 10 s (the value 10 seconds later). For example, in the first iteration the for loop should compare 12 and 28. – Alex Davies Mar 06 '19 at 19:05
  • What is the datatype of the `time` column of the dataframe? `print(df.time)` should give you that information. – Ralf Mar 06 '19 at 19:05
  • @Ralf It showed me an error. However, in the Excel sheet the time looks like this (e.g.): 1/1/2012 12:00:06 AM. – Alex Davies Mar 06 '19 at 19:10
  • Will there be gaps in the data or can you always count on having a data point approximately every 5 seconds? What should you do when there's a gap? – Mark Ransom Mar 06 '19 at 23:58
  • @MarkRansom Yes, there is roughly a 5-second difference from the beginning of the day until 23:59:59. However, sometimes there is a 3-, 4-, or 6-second difference between the data points, and this is where I would like to call the function to round the time to the closest time in the data. Please let me know if it still isn't clear. Thanks – Alex Davies Mar 07 '19 at 00:01
  • Don't bother rounding then, just iterate through your data and take the time directly from it. – Mark Ransom Mar 07 '19 at 00:13
  • @MarkRansom Can you please elaborate? If I don't round, it will show me an error, since the time that the for loop takes isn't available in the dataset. My dataset is more random than what is shown above, but the idea is the same. I also don't know how to iterate with a for loop using time instead of indices. I would appreciate your help if you can. – Alex Davies Mar 07 '19 at 00:16
  • Would it be easier to iterate through the full dataset, remember the last ten seconds, and compare to "ten seconds ago"? If you look at times as they fall out the back of that window, you should be able to get the identical result, without having to look ahead. – Kenny Ostrom Mar 07 '19 at 03:46
  • @KennyOstrom Thanks for your comment. Sorry, but I'm not sure I get your point. I hope you didn't misunderstand my question. One more thing: I have to do more than one comparison, and sometimes I will have to compare the value at a certain time with a future value. – Alex Davies Mar 07 '19 at 03:53

1 Answer

If you plan to run this function more than once, it is relatively cheap to create an optimized version instead: a simple binary search on the sorted dates is likely to outperform a linear search significantly, even if the dates have strange gaps.

See the linked question: "From list of integers, get number closest to a given value".

If you really need to be efficient, there are other trees you could construct on the dates to give more efficient searches.
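As a rough illustration of the binary-search idea applied to the loop in the question, here is a minimal sketch using only the standard library. The names `nearest_index` and `compare_ahead` are made up for this example, the date 2012-01-01 is borrowed from the asker's comment, and two behaviours are assumptions rather than anything stated in the question: ties go to the earlier timestamp, and a lookup past the end of the data falls back to the last row.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_index(times, target):
    """Return the index of the entry in sorted `times` closest to `target`."""
    pos = bisect_left(times, target)  # binary search for the insertion point
    if pos == 0:
        return 0
    if pos == len(times):
        return len(times) - 1  # assumption: clamp to the last row
    before, after = times[pos - 1], times[pos]
    # Assumption: on a tie, prefer the earlier timestamp.
    return pos - 1 if target - before <= after - target else pos


def compare_ahead(times, values, start, end, step, lookahead):
    """For each t in start..end, test value(nearest t) > value(nearest t + lookahead)."""
    results = []
    t = start
    while t <= end:
        now = values[nearest_index(times, t)]
        later = values[nearest_index(times, t + lookahead)]
        results.append((t, now > later))
        t += step
    return results


# Sample rows from the question.
day = datetime(2012, 1, 1)
times = [day + timedelta(seconds=s) for s in (1, 6, 11, 15, 20)]
values = [12, 18, 28, 32, 51]

results = compare_ahead(
    times,
    values,
    start=times[0],
    end=times[-1],
    step=timedelta(seconds=5),
    lookahead=timedelta(seconds=10),
)
for t, is_larger in results:
    print(t.time(), is_larger)
```

With a real dataset the `times` and `values` lists would come from the Excel sheet (for example via pandas) instead of being typed in, and `end` would be 23:59:59 of that day.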

Cireo
  • Thanks for your answer, but can you please elaborate? I know that binary search is for finding a value in a list, not for rounding values. – Alex Davies Mar 07 '19 at 22:39
  • If you look at the linked SO question in the middle of my answer, you will see several approaches. In particular, the "standard" binary search has a failure condition ("not found") when the upper and lower bounds are reduced to nothing. In your case, you just take whichever of the two boundaries is closer to your number. – Cireo Mar 07 '19 at 23:53
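
The boundary rule described in the last comment could look roughly like the hand-written binary search below. This is only a sketch: `closest` is a hypothetical helper, times are represented as seconds since midnight for simplicity, and breaking ties toward the smaller value is an assumption.

```python
def closest(sorted_values, target):
    """Binary search that returns the nearest element instead of failing."""
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return sorted_values[mid]  # exact match
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    # "Not found": the bounds have crossed, so hi points at the largest
    # element below target and lo at the smallest element above it.
    if hi < 0:
        return sorted_values[lo]  # target is before the first element
    if lo >= len(sorted_values):
        return sorted_values[hi]  # target is after the last element
    below, above = sorted_values[hi], sorted_values[lo]
    # Assumption: on a tie, prefer the smaller value.
    return below if target - below <= above - target else above


# Times from the question, as seconds since midnight.
seconds = [1, 6, 11, 15, 20]
print(closest(seconds, 16))  # 00:00:16 is missing, so 15 is returned
```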