Sort a list using lambda and regex in Python

Question

list = ['xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime', 'xxxx ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime', 'yyyyy ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime']

datet = re.compile(r'ResultDatetime:(\d{4}-\d{2}-\d{2} \d{2}:\d{2})')

list.sort(key = lambda x: ........)

I want to sort the list in ascending order, starting with the earliest date. How should I go about it using a lambda and a regex?

Why do you have these weird strings? What is the expected output for the given list? — timgeb, Nov 14 '18 at 16:33
sorry the original string had '<' characters in it which interfered with the way it was displayed. I have edited the question as you can see now — dratoms, Nov 14 '18 at 16:34
Avoid `list` as a variable name, there's already the builtin `list`. — timgeb, Nov 14 '18 at 16:51

score 2 · Answer 1 · answered Nov 14 '18 at 16:54

With the code you have there it is sufficient to do:

list.sort(key=lambda x: datet.search(x).group(1))

(but please, don't use list as a variable name).

There is no need to convert the extracted string to a datetime as it is already in a format that will sort naturally.
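As a quick check of that claim, here is a minimal sketch comparing a plain string sort against a sort on parsed datetime values. The `stamps` list is made up for illustration, using the same `YYYY-MM-DD HH:MM` layout that the regex captures. The two orderings should agree because every field is zero-padded and ordered from most to least significant.

```python
from datetime import datetime

# Illustrative timestamps in the same layout the regex captures.
stamps = ['2017-05-31 09:38', '2017-05-26 15:36', '2017-10-23 16:16']

# Sort the raw strings lexicographically.
by_text = sorted(stamps)

# Sort the same strings by their parsed datetime values.
by_time = sorted(stamps, key=lambda s: datetime.strptime(s, '%Y-%m-%d %H:%M'))

# True when both orderings agree.
print(by_text == by_time)
```

This only holds for fixed-width, zero-padded formats; something like `5/31/2017` would need real parsing.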

Note, however, that if any string does not match the regex, this will raise an error (search returns None, which has no group method), so it may be better to split the key out into a named multi-line function and test for a successful match before returning the matched group.

def sort_key(line):
    match = datet.search(line)
    if match:
        return match.group(1)
    # No match: fall back to an empty string instead of raising.
    return ''

data = [
    'xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime',
    'xxxx ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime',
    'yyyyy ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime'
]
data.sort(key=sort_key)
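One consequence of returning '' is that unmatched lines sort before every dated line, since the empty string compares lower than any other string. If you would rather have them at the end, one possible variant is to return a tuple. The name `sort_key_unmatched_last` and the sample `rows` are hypothetical, not part of the original code:

```python
import re

datet = re.compile(r'ResultDatetime:(\d{4}-\d{2}-\d{2} \d{2}:\d{2})')

def sort_key_unmatched_last(line):
    # Hypothetical variant of sort_key: the first tuple element is 0 for
    # lines with a timestamp and 1 for lines without, so unmatched lines
    # are pushed to the end of the sorted list.
    match = datet.search(line)
    if match:
        return (0, match.group(1))
    return (1, '')

rows = [
    'no timestamp here',
    'xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime',
    'yyyyy ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime',
]
rows.sort(key=sort_key_unmatched_last)
print(rows)
```

Tuples compare element by element, so the flag decides first and the timestamp only breaks ties among matched lines.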

thanks for that syntax. that was elusive to me. and thanks for that neat little function there. Although the list element part is autogenerated and is unlikely that there will be missing values, your function is going to help me a lot in future, a newbie to python (and programming in general) that I am. — dratoms, Nov 14 '18 at 17:13

score 0 · Answer 2 · answered Nov 14 '18 at 16:58

You can use dateutil.parser.parse (see this answer: Parse date strings?) to parse the date and re.findall to extract it from each string:

import re
from dateutil.parser import parse

# Named data rather than list, to avoid shadowing the builtin.
data = ['xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime',
        'xxxx ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime',
        'yyyyy ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime']
datet = re.compile(r'ResultDatetime:(\d{4}-\d{2}-\d{2} \d{2}:\d{2})')

# findall returns the captured groups; take the first one and parse it.
data.sort(key=lambda x: parse(re.findall(datet, x)[0]))
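If installing dateutil is not an option, roughly the same idea can be sketched with only the standard library by swapping in `datetime.strptime`. This is a substitute for `dateutil.parser.parse`, not what the answer above uses, and the helper name `parsed_key` is made up for illustration. It assumes every line matches the regex.

```python
import re
from datetime import datetime

datet = re.compile(r'ResultDatetime:(\d{4}-\d{2}-\d{2} \d{2}:\d{2})')

def parsed_key(line):
    # Hypothetical helper: extract the timestamp text and parse it with an
    # explicit format string. Assumes the line matches; otherwise search()
    # returns None and .group raises AttributeError.
    return datetime.strptime(datet.search(line).group(1), '%Y-%m-%d %H:%M')

data = [
    'xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime',
    'xxxx ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime',
    'yyyyy ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime',
]
data.sort(key=parsed_key)
print(data)
```

Unlike `parse`, `strptime` does not guess the format, so the format string has to match what the regex captures.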

I haven't used dateutil so far. But it seems promising. Will keep this in mind. — dratoms, Nov 14 '18 at 17:15

score 0 · Answer 3 · answered Nov 14 '18 at 17:06

I think the simplest solution without any imports would be:

data = ['xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime',
        'xxxx ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime',
        'yyyyy ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime']

sorted_data = sorted(data, key=lambda x: x[20:36])

print(sorted_data)

Output:

        ['xxxx ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime', 
         'xxxx ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime', 
         'yyyyy ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime']

Nick

The last string has the date at a slightly different offset. I think the OP's intention is that xxxx and yyyyy could be any arbitrarily long strings. – Duncan Nov 14 '18 at 17:08
Exactly. And there could be other numbers in the string before the regex pattern that would interfere with natural sorting here. – dratoms Nov 14 '18 at 17:20
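The concern raised in these comments can be illustrated with a small sketch. The `lines` below are invented examples with prefixes of different lengths. With them, the fixed slice `x[20:36]` no longer lines up with the timestamp, while a regex-based key still finds it wherever it sits.

```python
import re

datet = re.compile(r'ResultDatetime:(\d{4}-\d{2}-\d{2} \d{2}:\d{2})')

# Invented sample lines whose prefixes differ in length.
lines = [
    'job 7 ResultDatetime:2017-05-31 09:38:00.000:ResultDatetime',
    'xxxx ResultDatetime:2017-10-23 16:16:00.000:ResultDatetime',
    'a much longer prefix ResultDatetime:2017-05-26 15:36:00.000:ResultDatetime',
]

# Key on a fixed character range, as in the slicing answer.
by_slice = sorted(lines, key=lambda x: x[20:36])

# Key on the regex capture, which does not depend on the prefix length.
by_regex = sorted(lines, key=lambda x: datet.search(x).group(1))

# Shows what each slice actually captured for these lines.
print([x[20:36] for x in lines])
# False here: the two orderings disagree for these lines.
print(by_slice == by_regex)
```

Only the regex ordering is chronological for this input; the slice key ends up comparing fragments such as a leading colon or part of the `ResultDatetime` tag.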
