10

Given this base date:

base_date = "10/29 06:58 AM"

I want to find a tuple within the list that contains the closest date to the base_date, but it must not be an earlier date.

list_date = [('10/30 02:18 PM', '-103', '-107'), ('10/30 02:17 PM', '+100', '-110'), \
             ('10/29 02:15 AM', '-101', '-109') 

so here the output should be ('10/30 02:17 PM', '+100', '-110') (it can't be the 3rd tuple because the date there happened earlier than the base date)

My question is, does it exist any module for such date comparison? I tried to first change the data all to AM format and then compare but my code gets ugly with lots of slicing.

@edit:

Big list to test:

[('10/30 02:18 PM', '+13 -103', '-13 -107'), ('10/30 02:17 PM', '+13 +100', '-13 -110'), ('10/30 02:15 PM', '+13 -101', '-13 -109'), ('10/30 02:14 PM', '+13 -103', '-13 -107'), ('10/30 01:59 PM', '+13 -105', '-13 -105'), ('10/30 01:46 PM', '+13 -106', '-13 -104'), ('10/30 01:37 PM', '+13 -105', '-13 -105'), ('10/30 01:24 PM', '+13 -107', '-13 -103'), ('10/30 01:23 PM', '+13 -106', '-13 -104'), ('10/30 01:05 PM', '+13 -103', '-13 -107'), ('10/30 01:02 PM', '+13 -104', '-13 -106'), ('10/30 12:55 PM', '+13 -103', '-13 -107'), ('10/30 12:51 PM', '+13.5 -110', '-13.5 +100'), ('10/30 12:44 PM', '+13.5 -108', '-13.5 -102'), ('10/30 12:38 PM', '+13.5 -107', '-13.5 -103'), ('10/30 12:35 PM', '+13 -102', '-13 -108'), ('10/30 12:34 PM', '+13 -103', '-13 -107'), ('10/30 12:06 PM', '+13.5 -110', '-13.5 +100'), ('10/30 11:57 AM', '+13.5 -108', '-13.5 -102'), ('10/30 11:36 AM', '+13.5 -107', '-13.5 -103'), ('10/30 09:01 AM', '+13.5 -110', '-13.5 +100'), ('10/30 08:59 AM', '+13.5 -108', '-13.5 -102'), ('10/30 08:13 AM', '+13.5 -105', '-13.5 -105'), ('10/30 06:11 AM', '+13.5 +100', '-13.5 -110'), ('10/30 06:09 AM', '+13.5 -105', '-13.5 -105'), ('10/30 06:04 AM', '+13.5 -110', '-13.5 +100'), ('10/30 05:32 AM', '+13.5 -105', '-13.5 -105'), ('10/30 04:48 AM', '+13.5 -107', '-13.5 -103'), ('10/30 12:51 AM', '+13.5 -110', '-13.5 +100'), ('10/29 01:31 PM', '+13.5 -105', '-13.5 -105'), ('10/29 01:31 PM', '+13 +103', '-13 -113'), ('10/29 01:28 PM', '+13 -102', '-13 -108'), ('10/29 07:59 AM', '+13 -105', '-13 -105'), ('10/29 07:20 AM', '+13 -103', '-13 -107'), ('10/29 07:14 AM', '+13 -105', '-13 -105'), ('10/29 04:47 AM', '+13 +100', '-13 -110'), ('10/29 04:14 AM', '+13 -105', '-13 -105'), ('10/28 08:17 PM', '+12.5 +100', '-12.5 -110'), ('10/28 12:52 PM', '+12.5 -105', '-12.5 -105')]

Big list to test2:

[('10/30 04:30 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:24 PM', '+1.5 -110', '-1.5     +100'), ('10/30 04:21 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:15 PM', '+1.5 -112', '-1.5 +102'), ('10/30 04:14 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:57 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:40 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:31 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:30 PM', '+1.5 -109', '-1.5 -101'), ('10/30 03:25 PM', '+1.5 -107', '-1.5 -103'), ('10/30 03:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:23 PM', '+1.5 -108', '-1.5 -102'), ('10/30 03:22 PM', '+1.5 -106', '-1.5 -104'), ('10/30 02:14 PM', '+1.5 -104', '-1.5 -106'), ('10/30 01:41 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:37 PM', '+1.5 -107', '-1.5 -103'), ('10/30 01:36 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:06 PM', '+1.5 -103', '-1.5 -107'), ('10/30 12:56 PM', '+2 -111', '-2 +101'), ('10/30 12:53 PM', '+2 -110', '-2 +100'), ('10/30 12:50 PM', '+2 -113', '-2 +103'), ('10/30 12:49 PM', '+2 -112', '-2 +102'), ('10/30 12:46 PM', '+2 -113', '-2 +103'), ('10/30 12:45 PM', '+2 -110', '-2 +100'), ('10/30 12:43 PM', '+2 -108', '-2 -102'), ('10/30 12:38 PM', '+2.5 -116', '-2.5 +106'), ('10/30 12:38 PM', '+2.5 -113', '-2.5 +103'), ('10/30 12:37 PM', '+2.5 -110', '-2.5 +100'), ('10/30 10:30 AM', '+2.5 -105', '-2.5 -105'), ('10/30 10:07 AM', '+3 -113', '-3 +103'), ('10/30 09:55 AM', '+3 -112', '-3 +102'), ('10/30 09:51 AM', '+3 -110', '-3 +100'), ('10/30 09:32 AM', '+3 -109', '-3 -101'), ('10/30 06:04 AM', '+3 -110', '-3 +100'), ('10/30 03:16 AM', '+3 -107', '-3 -103'), ('10/30 03:14 AM', '+3.5 -116', '-3.5 +106'), ('10/30 01:03 AM', '+3.5 -115', '-3.5 +105'), ('10/30 12:17 AM', '+3.5 -110', '-3.5 +100'), ('10/29 08:52 PM', '+3.5 -108', '-3.5 -102'), ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105'), ('10/29 06:48 AM', '+3.5 -110', '-3.5 +100'), ('10/29 06:47 AM', '+3.5 -109', '-3.5 -101'), ('10/29 05:39 AM', '+3.5 -113', '-3.5 +103'), ('10/29 03:34 AM', '+3.5 -108', '-3.5 -102'), ('10/29 12:44 AM', '+3.5 -110', '-3.5 +100'), ('10/29 12:41 AM', '+3.5 -107', '-3.5 -103'), ('10/29 12:40 AM', '+3.5 -105', '-3.5 -105'), ('10/28 12:52 PM', '+4 -105', '-4 -105')]
nutship
  • 4,624
  • 13
  • 47
  • 64

8 Answers8

13

This can be done using datetime module, which is able to parse date string into datetime object, which supports comparison and arithmetic with dates:

from datetime import datetime

# function for parsing strings using specific format
get_datetime = lambda s: datetime.strptime(s, "%m/%d %I:%M %p")

base = get_datetime(base_date)
later = filter(lambda d: get_datetime(d[0]) > base, list_date)
closest_date = min(later, key = lambda d: get_datetime(d[0]))
Andrei
  • 55,890
  • 9
  • 87
  • 108
  • Neat, a very pythonic solution :-) – Nils Werner Jun 22 '13 at 09:58
  • For more on date parsing, check the [datetime documentation](http://docs.python.org/2/library/datetime.html#strftime-strptime-behavior). It's able to parse dates formatted in basically any way you can think of. – Imre Kerr Jun 22 '13 at 10:00
  • 2
    @NilsWerner Indeed, very pythonic. How can I tell? Over half the lines contain a `lambda`. (I kid of course, it is a very nice solution.) – Imre Kerr Jun 22 '13 at 10:05
  • I love this solution, but it returns wrong output, I'm afraid. – nutship Jun 22 '13 at 10:11
  • @nutship, can you give any specifics? It works on the example from the question. – Andrei Jun 22 '13 at 10:13
  • Sorry it takes me that long to answer but needed to make certain. I checked against a long list (see edited OP) and it gives me the last tuple (the one which has the earliest date) from the list not the closest one. – nutship Jun 22 '13 at 10:30
  • @nutship, it gives me `('10/29 07:14 AM', '+13 -105', '-13 -105')` which seems to be right - it really the closest one to `'10/29 06:58 AM'`. What does it give you? – Andrei Jun 22 '13 at 10:31
  • hmmm, it gave me `('10/28 12:52 PM', '+12.5 -105', '-12.5 -105')` Alright, will check that again. – nutship Jun 22 '13 at 10:32
  • No idea why I'm getting the wrong output all the time. Tested in various interpreters. Thanks anyway. btw. does your solution distinguish between AM and PM? – nutship Jun 22 '13 at 10:44
  • 1
    If you really want things to "look functional", scrap all those lambdas and use `partial` and `compose`. Isn't `filter(compose(partial(operator.gt, base), compose(operator.itemgetter(0), get_datetime), list_date)` much nicer? :) – abarnert Jun 22 '13 at 11:11
  • @Andrei, I'm sorry I did not accept your solution, but it just did not work for me correctly, although I love your syntax. – nutship Jun 22 '13 at 11:25
  • @nutship, if you do not mind I would like to sort it out. Please join me in this [chat](http://chat.stackoverflow.com/rooms/32216/getting-the-closest-date-to-a-given-date) - you should have a write access there. – Andrei Jun 22 '13 at 15:28
  • I tried using this for a dataframe column by doing necessary modifications, but didn't succeed. Please check https://stackoverflow.com/questions/57604025/find-first-timestamp-in-column-greater-than-the-timestamp-of-every-row-by-a-give – Legolas Aug 22 '19 at 07:25
12
>>> from datetime import timedelta, datetime
>>> base_date = "10/29 06:58 AM"
>>> b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
def func(x):
    d =  datetime.strptime(x[0], "%m/%d %I:%M %p")
    delta =  d - b_d if d > b_d else timedelta.max
    return delta
... 
>>> min(list_date, key = func)
('10/30 02:17 PM', '+100', '-110')

datetime.strptime converts the date to a datetime object, so b_d now looks something like this :

>>> b_d
datetime.datetime(1900, 10, 29, 6, 58)

Now we can write a function that can be passed to key parameter of min:

delta =  d - b_d if d > b_d else timedelta.max

if d > b_d i.e if the date passed to min is greater than base_date then assign their difference to delta else assign timedelta.max to it.

>>> timedelta.max
datetime.timedelta(999999999, 86399, 999999)

Update:

>>> from datetime import timedelta, datetime
>>> base_date = '10/29 06:59 AM'
>>> b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
>>> def func(x):
...         d =  datetime.strptime(x[0], "%m/%d %I:%M %p")
...         delta =  d - b_d if d > b_d else timedelta.max
...         return delta
... 
>>> lis2 = [('10/30 04:30 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:24 PM', '+1.5 -110', '-1.5     +100'), ('10/30 04:21 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:15 PM', '+1.5 -112', '-1.5 +102'), ('10/30 04:14 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:57 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:40 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:31 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:30 PM', '+1.5 -109', '-1.5 -101'), ('10/30 03:25 PM', '+1.5 -107', '-1.5 -103'), ('10/30 03:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:23 PM', '+1.5 -108', '-1.5 -102'), ('10/30 03:22 PM', '+1.5 -106', '-1.5 -104'), ('10/30 02:14 PM', '+1.5 -104', '-1.5 -106'), ('10/30 01:41 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:37 PM', '+1.5 -107', '-1.5 -103'), ('10/30 01:36 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:06 PM', '+1.5 -103', '-1.5 -107'), ('10/30 12:56 PM', '+2 -111', '-2 +101'), ('10/30 12:53 PM', '+2 -110', '-2 +100'), ('10/30 12:50 PM', '+2 -113', '-2 +103'), ('10/30 12:49 PM', '+2 -112', '-2 +102'), ('10/30 12:46 PM', '+2 -113', '-2 +103'), ('10/30 12:45 PM', '+2 -110', '-2 +100'), ('10/30 12:43 PM', '+2 -108', '-2 -102'), ('10/30 12:38 PM', '+2.5 -116', '-2.5 +106'), ('10/30 12:38 PM', '+2.5 -113', '-2.5 +103'), ('10/30 12:37 PM', '+2.5 -110', '-2.5 +100'), ('10/30 10:30 AM', '+2.5 -105', '-2.5 -105'), ('10/30 10:07 AM', '+3 -113', '-3 +103'), ('10/30 09:55 AM', '+3 -112', '-3 +102'), ('10/30 09:51 AM', '+3 -110', '-3 +100'), ('10/30 09:32 AM', '+3 -109', '-3 -101'), ('10/30 06:04 AM', '+3 -110', '-3 +100'), ('10/30 03:16 AM', '+3 -107', '-3 -103'), ('10/30 03:14 AM', '+3.5 -116', '-3.5 +106'), ('10/30 01:03 AM', '+3.5 -115', '-3.5 +105'), ('10/30 12:17 AM', '+3.5 -110', '-3.5 +100'), ('10/29 08:52 PM', '+3.5 -108', '-3.5 -102'), ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105'), ('10/29 06:48 AM', '+3.5 -110', '-3.5 +100'), ('10/29 06:47 AM', '+3.5 -109', '-3.5 -101'), ('10/29 05:39 AM', '+3.5 -113', '-3.5 +103'), ('10/29 03:34 AM', '+3.5 -108', '-3.5 -102'), ('10/29 12:44 AM', '+3.5 -110', '-3.5 +100'), ('10/29 12:41 AM', '+3.5 -107', '-3.5 -103'), ('10/29 12:40 AM', '+3.5 -105', '-3.5 -105'), ('10/28 12:52 PM', '+4 -105', '-4 -105')]
>>> min(lis2, key = func)
('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')

Timing comparisons:

Script:

from datetime import datetime, timedelta
import sys
import time
list_date = [('10/30 04:30 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:24 PM', '+1.5 -110', '-1.5     +100'), ('10/30 04:21 PM', '+1.5 -111', '-1.5 +101'), ('10/30 04:15 PM', '+1.5 -112', '-1.5 +102'), ('10/30 04:14 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:57 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:40 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:31 PM', '+1.5 -111', '-1.5 +101'), ('10/30 03:30 PM', '+1.5 -109', '-1.5 -101'), ('10/30 03:25 PM', '+1.5 -107', '-1.5 -103'), ('10/30 03:24 PM', '+1.5 -110', '-1.5 +100'), ('10/30 03:23 PM', '+1.5 -108', '-1.5 -102'), ('10/30 03:22 PM', '+1.5 -106', '-1.5 -104'), ('10/30 02:14 PM', '+1.5 -104', '-1.5 -106'), ('10/30 01:41 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:37 PM', '+1.5 -107', '-1.5 -103'), ('10/30 01:36 PM', '+1.5 -105', '-1.5 -105'), ('10/30 01:06 PM', '+1.5 -103', '-1.5 -107'), ('10/30 12:56 PM', '+2 -111', '-2 +101'), ('10/30 12:53 PM', '+2 -110', '-2 +100'), ('10/30 12:50 PM', '+2 -113', '-2 +103'), ('10/30 12:49 PM', '+2 -112', '-2 +102'), ('10/30 12:46 PM', '+2 -113', '-2 +103'), ('10/30 12:45 PM', '+2 -110', '-2 +100'), ('10/30 12:43 PM', '+2 -108', '-2 -102'), ('10/30 12:38 PM', '+2.5 -116', '-2.5 +106'), ('10/30 12:38 PM', '+2.5 -113', '-2.5 +103'), ('10/30 12:37 PM', '+2.5 -110', '-2.5 +100'), ('10/30 10:30 AM', '+2.5 -105', '-2.5 -105'), ('10/30 10:07 AM', '+3 -113', '-3 +103'), ('10/30 09:55 AM', '+3 -112', '-3 +102'), ('10/30 09:51 AM', '+3 -110', '-3 +100'), ('10/30 09:32 AM', '+3 -109', '-3 -101'), ('10/30 06:04 AM', '+3 -110', '-3 +100'), ('10/30 03:16 AM', '+3 -107', '-3 -103'), ('10/30 03:14 AM', '+3.5 -116', '-3.5 +106'), ('10/30 01:03 AM', '+3.5 -115', '-3.5 +105'), ('10/30 12:17 AM', '+3.5 -110', '-3.5 +100'), ('10/29 08:52 PM', '+3.5 -108', '-3.5 -102'), ('10/29 01:31 PM', '+3.5 -105', '-3.5 -105'), ('10/29 06:48 AM', '+3.5 -110', '-3.5 +100'), ('10/29 06:47 AM', '+3.5 -109', '-3.5 -101'), ('10/29 05:39 AM', '+3.5 -113', '-3.5 +103'), ('10/29 03:34 AM', '+3.5 -108', '-3.5 -102'), ('10/29 12:44 AM', '+3.5 -110', '-3.5 +100'), ('10/29 12:41 AM', '+3.5 -107', '-3.5 -103'), ('10/29 12:40 AM', '+3.5 -105', '-3.5 -105'), ('10/28 12:52 PM', '+4 -105', '-4 -105')]

base_date = "10/29 06:58 AM"

def func1(list_date):
    #http://stackoverflow.com/a/17249420/846892
    get_datetime = lambda s: datetime.strptime(s, "%m/%d %I:%M %p")
    base = get_datetime(base_date)
    later = filter(lambda d: get_datetime(d[0]) > base, list_date)
    return min(later, key = lambda d: get_datetime(d[0]))

def func2(list_date):
    #http://stackoverflow.com/a/17249470/846892
    b_d = datetime.strptime(base_date, "%m/%d %I:%M %p")
    def func(x):
       d =  datetime.strptime(x[0], "%m/%d %I:%M %p")
       delta =  d - b_d if d > b_d else timedelta.max
       return delta
    return min(list_date, key = func)

def func3(list_date):
    #http://stackoverflow.com/a/17249529/846892
    fmt = '%m/%d %I:%M %p'
    d = datetime.strptime(base_date, fmt)
    def foo(x):
        return (datetime.strptime(x[0],fmt)-d).total_seconds() > 0
    return sorted(list_date, key=foo)[-1]

def func4(list_date):
    #http://stackoverflow.com/a/17249441/846892
    fmt = '%m/%d %I:%M %p'
    base_d = datetime.strptime(base_date, fmt)
    candidates = ((datetime.strptime(d, fmt), d, x, y) for d, x, y in list_date)
    candidates = min((dt, d, x, y) for dt, d, x, y in candidates if dt > base_d)
    return  candidates[1:]

Results:

>>> from so import *

#check output irst
>>> func1(list_date)
('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')
>>> func2(list_date)
('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')
>>> func3(list_date)
('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')
>>> func4(list_date)
('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')

>>> %timeit func1(list_date)
100 loops, best of 3: 3.07 ms per loop
>>> %timeit func2(list_date)
100 loops, best of 3: 1.59 ms per loop      #winner
>>> %timeit func3(list_date)
100 loops, best of 3: 1.91 ms per loop
>>> %timeit func4(list_date)
1000 loops, best of 3: 2.02 ms per loop

#increase the input size
>>> list_date = list_date *10**3
>>> len(list_date)
48000
>>> %timeit func1(list_date)
1 loops, best of 3: 3.6 s per loop
>>> %timeit func2(list_date)            #winner
1 loops, best of 3: 1.99 s per loop      
>>> %timeit func3(list_date)
1 loops, best of 3: 2.09 s per loop
>>> %timeit func4(list_date)
1 loops, best of 3: 2.02 s per loop


#increase the input size again

>>> list_date = list_date *10
>>> len(list_date)
480000
>>> %timeit func1(list_date)
1 loops, best of 3: 36.4 s per loop
>>> %timeit func2(list_date)                  #winner
1 loops, best of 3: 20.2 s per loop           
>>> %timeit func3(list_date)
1 loops, best of 3: 22.8 s per loop
>>> %timeit func4(list_date)
1 loops, best of 3: 22.7 s per loop
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
  • Hi, thanks for you solution, but does it distinguish between AM and PM? – nutship Jun 22 '13 at 10:43
  • Can you please check against `Big list to test2` from the OP if you got time? And as a `base_date` please insert `10/29 06:59 AM`. The correct output should be `('10/29 01:31 PM', '+3.5 -105', '-3.5 -105')` but your code produces `('10/29 08:52 PM', '+3.5 -108', '-3.5 -102')` which is incorrect. – nutship Jun 22 '13 at 10:54
  • 1
    @nutship see my updated solution, I'm getting correct output. – Ashwini Chaudhary Jun 22 '13 at 11:00
  • Hi, no actually your code has always worked well for me, but this time, since Andrei's solution got 5 votes up I thought it's maybe fair to accept his answer, hmm? – nutship Jun 24 '13 at 17:18
  • @nutship I'll compare timing results of the solutions posted on this thread and post them(choose the best one based on that), you should not select an answer just because it got more votes.(This doesn't imply Andrei's solution is wrong) – Ashwini Chaudhary Jun 24 '13 at 17:42
  • @nutship I've added the results, as expected my solution is the fastest(as my solution contains no python for-loop). While Andrei's solution is slowest(filter with lambda is slow as well as unpythonic). You can time them yourself using the `timeit` module. I've not added Skyler & Sylvain's solutions as I was getting some error, but I am sure their solution are surely slower than mine becuase `min()` is way fster than simple python for-loop. Burhan's solution is not memory efficient as well time cmplxity wise it's `O(NlogN)` due to sorting(`min()`is`O(N)`), and sorted() also creates a new list. – Ashwini Chaudhary Jun 24 '13 at 18:56
2

decorate, filter, find the closest date, undecorate

>>> base_date = "10/29 06:58 AM"
>>> list_date = [
...     ('10/30 02:18 PM', '-103', '-107'),
...     ('10/30 02:17 PM', '+100', '-110'),
...     ('10/29 02:15 AM', '-101', '-109')
... ]
>>> import datetime
>>> fmt = '%m/%d %H:%M %p'
>>> base_d = datetime.datetime.strptime(base_date, fmt)
>>> candidates = ((datetime.datetime.strptime(d, fmt), d, x, y) for d, x, y in list_date)
>>> candidates = min((dt, d, x, y) for dt, d, x, y in candidates if dt > base_d)
>>> print candidates[1:]
('10/30 02:17 PM', '+100', '-110')
falsetru
  • 357,413
  • 63
  • 732
  • 636
2

You can consider putting the dates list into a Pandas index and then use 'truncate' or 'get_loc' function.

import pandas as pd

##Initial inputs
list_date = [('10/30 02:18 PM', '-103', '-107'),('10/29 02:15 AM', '-101', '-109') , ('10/30 02:17 PM', '+100', '-110'), \
             ]  # reordered to show the method is input order insensitive
base_date = "10/29 06:58 AM"


##Make a data frame with data
df=pd.DataFrame(list_date)
df.columns=['date','val1','val2']
dateIndex=pd.to_datetime(df['date'], format='%m/%d %I:%M %p')
df=df.set_index(dateIndex) 
df=df.sort_index(ascending=False) #earliest comes on top 

##Find the result
base_dateObj=pd.to_datetime(base_date, format='%m/%d %I:%M %p')
result=df.truncate(after=base_dateObj).iloc[-1]  #take the bottom value, or the 1st after the base date
(result['date'],result['val1'], result['val2']) # result is ('10/30 02:17 PM', '+100', '-110')

Reference: this link

Community
  • 1
  • 1
Kevin Zhu
  • 2,746
  • 26
  • 23
1

Linear search?

import sys
import time

base_date = "10/29 06:58 AM"

def str_to_my_time(my_str):
    return time.mktime(time.strptime(my_str, "%m/%d %I:%M %p")) 
                # assume year 1900...

base_dt = str_to_my_time(base_date)

list_date = [('10/30 02:18 PM', '-103', '-107'), 
             ('10/30 02:17 PM', '+100', '-110'),
             ('10/29 02:15 AM', '-101', '-109')]


best_delta = sys.maxint
best_match = None

for t in list_date:
    the_dt = str_to_my_time(t[0])
    delta_sec = the_dt - base_dt
    if (delta_sec >= 0) and (delta_sec < best_delta):
        best_delta = delta_sec
        best_match = t

print best_match, best_delta

Producing:

('10/30 02:17 PM', '+100', '-110') 112740.0
Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
1
import time
import sys

#The Function
def to_sec(date_string):
    return time.mktime(time.strptime(date_string, '%m/%d %I:%M %p'))


#The Test
base_date = "10/29 06:58 AM"
base_date_sec = to_sec(base_date)
result = None
difference = sys.maxint
list_date = [
        ('10/30 02:18 PM', '-103', '-107'),
        ('10/30 02:17 PM', '+100', '-110'), 
        ('10/29 02:15 AM', '-101', '-109') ]
for date_str in list_date:
    diff_sec = to_sec(date_str[0])-base_date_sec
    if diff_sec >= 0 and diff_sec < difference:
        result = date_str
        difference = diff_sec
print result
Timothy
  • 4,467
  • 5
  • 28
  • 51
1
import datetime

fmt = '%m/%d %H:%M %p'
d = datetime.datetime.strptime(base_date, fmt)
def foo(x):
   return (datetime.datetime.strptime(x[0],fmt)-d).total_seconds() > 0
sorted(list_date, key=foo)[-1]
Burhan Khalid
  • 169,990
  • 18
  • 245
  • 284
0

I was looking up this problem and found some answers, most of which check all elements. I have my dates sorted (and assume most people do), so if you do as well, use numpy:

import numpy as np
// dates is a numpy array of np.datetime64 objects
dates = np.array([date1, date2, date3, ...], dtype=np.datetime64)
timestamp = np.datetime64('Your date')
np.searchsorted(dates, timestamp)

searchsorted uses binary search, which uses the fact the dates are sorted, and is thus very efficient. If you use pandas, this is possible:

dates = df.index # df is a DatetimeIndex-ed dataframe
timestamp = pd.to_datetime('your date here', format='its format')
np.searchsorted(dates, timestamp)

The function returns the index of the closest date (if the searched date is included in dates, its index is returned [if that isn't wanted, use side='right' as an argument into the function]), so to get the date do this:

dates[np.searchsorted(dates, timestamp)]
timlod
  • 1
  • 1