I have a dictionary named 'times' which maps keys to string values that represent a time:

times = {'key1': '12.23', 'key2': '43:53.29', 'key3': '1:38:11.50r'}

The string takes the form of [hours]:[minutes]:[seconds].[milliseconds][r] where every field is optional. The r is a flag that doesn't depend on any other values being filled in and doesn't factor into sorting. [hours] requires that [minutes] and down are present, but [minutes] doesn't require [hours] to be present.

I want to end up with a list of keys sorted by the time-ordering of their values.

I have the following:

standings = sorted(times, key=times.__getitem__)

but this compares the values as plain strings rather than as times. I'm new to Python, but if I were using Java I would probably write a Time class with a custom compareTo() method to get the sort to work.

I could write a function that converts the string to a time in milliseconds and then sort based on that, but I don't know how to do so using 'key=' in the sorted() function.

PearSquirrel

3 Answers

2
import re
def as_list(time):
    """
    >>> as_list('1:38:11.50r')
    [1, 38, 11, 50]
    >>> as_list('2.23')
    [0, 0, 2, 23]
    """
    # Extract times and convert to integers
    times = [int(x) for x in re.split(r"[:.]", re.sub(r"[a-z]$", "", time))]
    # If needed pad from the left side with zeros and return
    return times if len(times) == 4 else [0] * (4 - len(times)) + times

[k for k, t in sorted(times.items(), key = lambda x: as_list(x[1]))]

Or, more concisely:

[key for _, key in sorted((as_list(v), k) for k, v in times.items())]

It works because lists and tuples in Python are compared in lexicographical order: element by element, from left to right. Let's say you have a list as follows:

>>> l = [[0, 1], [-1 , 2, 3], [4, 5], [0, -1]]

You can call sorted on it:

>>> sorted(l)
[[-1, 2, 3], [0, -1], [0, 1], [4, 5]]

Hence all the magic.

Regarding [0] * (4 - len(times)) + times, you can read more here: Create List of Single Item Repeated n Times in Python

Long story short, some_list * some_integer creates a list that contains the elements of some_list repeated some_integer times.
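
As a small illustration of that padding step, here is a minimal sketch. The names parts, padded and full are made up for this example and are not part of the answer's code.

```python
# Fields parsed from a short value such as '11.50' (hypothetical example input).
parts = [11, 50]

# [0] * 2 builds [0, 0]; concatenating puts the zeros on the left.
padded = [0] * (4 - len(parts)) + parts
print(padded)

# Multiplying a list by zero gives an empty list, so a value that already
# has four fields comes back unchanged.
full = [1, 38, 11, 50]
print([0] * (4 - len(full)) + full)
```

Because multiplying by zero yields an empty list, the padding expression on its own already handles the four-field case.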

zero323
  • It looks like this works! I understand splitting based on ':' and '.', ignoring any non-numeric characters. Could you explain "return times if len(times) == 4 else [0] * (4 - len(times)) + times" and "[k for k, t in sorted(times.items(), key = lambda x: as_list(x[1]))]"? – PearSquirrel Dec 19 '14 at 01:37
  • You don't need `\.`, `.` will not have its special meaning with `[]`. Also, use raw strings. – thefourtheye Dec 19 '14 at 01:58
  • @thefourtheye Updated, thanks. I've spent too much time with R lately and I feel like I have to escape everything and then escape once again. – zero323 Dec 19 '14 at 02:02
0

You can do the following:

standings = sorted(times.items(), key=lambda t: t[0])

This sorts by the key of the dictionary. To sort by value, replace t[0] with t[1].

Note that t[0] is used here only to show how to reference part of each item. In your case you would reference t[1], and you would probably pass that value to a function that converts the time into a string format that sorts correctly in lexicographical order. For instance, assuming you have a time_format function that returns a padded time, you would replace the t[0] above with time_format(t[1]).
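
One possible shape for such a time_format function is sketched below. The answer does not define it, so the body is an assumption: it pads hours, minutes and seconds to two digits and the fraction to three, which only orders correctly while hours stay below 100.

```python
import re

def time_format(time):
    """Hypothetical helper: return a fixed-width 'HH:MM:SS.fff' string."""
    # Drop the trailing flag letter (the 'r'), if present.
    cleaned = re.sub(r"[a-z]$", "", time)
    # Only the seconds field can carry a fraction, so split on the first '.'.
    whole, _, fraction = cleaned.partition(".")
    # Treat empty fields as zero, then pad missing hours/minutes on the left.
    fields = [f or "0" for f in whole.split(":")]
    fields = ["0"] * (3 - len(fields)) + fields
    hours, minutes, seconds = (f.zfill(2) for f in fields)
    # Right-pad the fraction so '5' compares as '500', not as 5.
    return "{}:{}:{}.{}".format(hours, minutes, seconds, fraction.ljust(3, "0"))

times = {'key1': '12.23', 'key2': '43:53.29', 'key3': '1:38:11.50r'}
standings = sorted(times.items(), key=lambda t: time_format(t[1]))
print([key for key, _ in standings])
```

Note that sorting times.items() gives (key, value) pairs, so the keys have to be pulled out afterwards if only they are wanted.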

Claudio Corsi
0

I think you can define a function to_decimal to convert the time string to a decimal for comparison, then:

standings = sorted(times, key = lambda x : to_decimal(times[x]))
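
The to_decimal function is not defined in this answer, so the following is only a sketch of what it might look like, assuming the format described in the question. It returns the time in seconds as a float and reads the part after the dot as a decimal fraction, so '12.5' sorts after '12.23'.

```python
import re

def to_decimal(time):
    """Hypothetical helper: convert '[h]:[m]:[s].[fraction][r]' to seconds."""
    # Drop the trailing flag letter (the 'r'), if present.
    cleaned = re.sub(r"[a-z]$", "", time)
    total = 0.0
    for field in cleaned.split(":"):
        # Each colon shifts the running total up by a factor of 60;
        # an empty field counts as zero.
        total = total * 60 + float(field or 0)
    return total

times = {'key1': '12.23', 'key2': '43:53.29', 'key3': '1:38:11.50r'}
standings = sorted(times, key=lambda x: to_decimal(times[x]))
print(standings)
```

Floats are close enough for ordering here; if exact ties between very close times matter, decimal.Decimal could be swapped in for float.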
ciphor