9

I have a list of durations like the one below:

['5d', '20h', '1h', '7m', '14d', '1m']

where d stands for days, h stands for hours and m stands for minutes.

I want to get the highest duration from this list (14d in this case). How can I get that from this list of strings?

martineau
Rafiul Sabbir
    https://stackoverflow.com/a/4628148/1224467 This answer has a solution to turn your strings into timedeltas. These can be sorted. – H4kor Jan 17 '20 at 14:13

7 Answers

15

Use np.argmax on pd.to_timedelta:

import numpy as np
import pandas as pd

durations = ['5d', '20h', '1h', '7m', '14d', '1m']

durations[np.argmax(pd.to_timedelta(durations))]
Out[24]: '14d'

pd.to_timedelta converts each string into a duration, and np.argmax returns the index of the largest element.

Nicolas Gervais
  • Although I am not going to use numpy and/or pandas in the project where I have this issue, I must say that's an amazing example of using library functions efficiently to get the job done. – Rafiul Sabbir Jan 17 '20 at 14:23
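For projects that avoid numpy and pandas, the same "convert to a duration, then compare" idea can be sketched with the standard library's datetime.timedelta. The helper name to_timedelta and the UNIT_KWARGS mapping are illustrative assumptions, not part of the answer above, and the sketch assumes every string is an integer followed by a single d, h or m.

```python
from datetime import timedelta

# Hypothetical mapping from unit suffix to the timedelta keyword it stands for.
UNIT_KWARGS = {"d": "days", "h": "hours", "m": "minutes"}

def to_timedelta(text):
    """Convert a string such as '14d' into a datetime.timedelta."""
    amount, unit = int(text[:-1]), text[-1]
    return timedelta(**{UNIT_KWARGS[unit]: amount})

durations = ['5d', '20h', '1h', '7m', '14d', '1m']

# timedelta objects compare by length, so they work directly as a max() key.
longest = max(durations, key=to_timedelta)
print(longest)  # 14d
```

Because max() returns the original list element, the result is the string itself rather than the converted timedelta.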
13

Pure Python solution. We can store a mapping (here time_map) from each unit suffix (m, h, d) to its length in minutes, then use max() with a key argument that converts every string to minutes before comparing.

inp = ['5d', '20h', '1h', '7m', '14d', '1m']
time_map = {'m': 1, 'h': 60, 'd': 24*60}

print(max(inp, key=lambda x:int(x[:-1])*time_map[x[-1]]))  # -> 14d
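The one-liner above assumes every string is well-formed. A slightly more defensive variant might validate the input first; the following is a minimal sketch, where the function name to_minutes and the choice to raise ValueError are assumptions made for illustration.

```python
import re

TIME_MAP = {'m': 1, 'h': 60, 'd': 24 * 60}  # minutes per unit, as in the answer
_DURATION = re.compile(r'(\d+)([mhd])')      # digits followed by one unit letter

def to_minutes(text):
    """Return the length of a duration string such as '20h' in minutes."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    amount, unit = match.groups()
    return int(amount) * TIME_MAP[unit]

inp = ['5d', '20h', '1h', '7m', '14d', '1m']
print(max(inp, key=to_minutes))  # 14d
```

A string with an unknown suffix, such as '5x', then fails loudly with a ValueError instead of a KeyError from the dictionary lookup.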
Clément
Filip Młynarski
5

Here's an absolute hack which solves the problem in a bad but clever way: Python's min and max functions can be used with a key function which is used to compare elements, so that it returns the element minimising or maximising that function. If the key function returns a tuple, then the order is determined by the first component of the tuple, using the second component as a tie-breaker.

We can exploit the fact that the unit suffixes 'd', 'h' and 'm' happen to sort alphabetically from longest to shortest: a day is longer than an hour, which is longer than a minute. This means the longest duration has the minimum suffix in alphabetical order, with the maximum integer as a tie-breaker. Maximising that integer is the same as minimising its negation:

>>> durations = ['5d', '20h', '1h', '7m', '14d', '1m']
>>> min(durations, key=lambda d: (d[-1], -int(d[:-1])))
'14d'
kaya3
    You could have seconds represented as `s`, and weeks as `W` and months as `M`, and it would still work; the hack breaks if you include years as either `y` or `Y`, though. It also relies on there not being any durations like `10000000m` where a larger unit could be used instead; I'm assuming the strings are the output of some API which always uses the largest unit possible. – kaya3 Jan 17 '20 at 14:34
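The limitation described in that comment can be seen in a small comparison. This is only a sketch: hack_longest wraps the expression from the answer, while true_longest and the MINUTES table are illustrative names that convert everything to minutes first.

```python
MINUTES = {'m': 1, 'h': 60, 'd': 24 * 60}

def hack_longest(durations):
    # The alphabetical-suffix trick from the answer above.
    return min(durations, key=lambda d: (d[-1], -int(d[:-1])))

def true_longest(durations):
    # Compare actual lengths in minutes.
    return max(durations, key=lambda d: int(d[:-1]) * MINUTES[d[-1]])

normalised = ['5d', '20h', '14d']
unnormalised = ['1d', '3000m']  # 3000 minutes is more than two days

print(hack_longest(normalised), true_longest(normalised))      # 14d 14d
print(hack_longest(unnormalised), true_longest(unnormalised))  # 1d 3000m
```

The two approaches agree only as long as each duration is written with the largest unit that fits.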
3
lst = ['5d', '20h', '1h', '7m', '14d', '1m']
max(lst, key=lambda s: (-ord(s[-1]), int(s[:-1])))

Output:

'14d'

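This works for this particular set of strings, but if the format differs, the first element of the tuple will need to be adjusted accordingly. Right now it works because the character codes are ordered 's' > 'm' > 'h' > 'd', which is the reverse of the duration order, so negating ord() ranks days highest.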
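If relying on character codes feels too fragile, the first element of the tuple could come from an explicit ranking instead. The sketch below assumes a hypothetical UNIT_RANK table and, like the original, still assumes each duration uses the largest unit that fits.

```python
# Character codes that the ord()-based key relies on.
print([ord(c) for c in 'dhms'])  # [100, 104, 109, 115]

# Hypothetical explicit ranking: a larger rank means a longer unit.
UNIT_RANK = {'m': 0, 'h': 1, 'd': 2}

def sort_key(s):
    # Compare by unit first, then by the number in front of it.
    return (UNIT_RANK[s[-1]], int(s[:-1]))

lst = ['5d', '20h', '1h', '7m', '14d', '1m']
print(max(lst, key=sort_key))  # 14d
```

Adding a new unit then only requires a new entry in the table, whatever letter it uses.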

r.ook
3

Here is a solution using regular expressions:

import numpy as np
import re

new_list = []
x=['5d', '20h', '1h', '7m', '14d', '1m']
map_time={"d":1440, "h":60, "m":1}

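for item in x:
    letter = re.findall("[a-zA-Z]+", item)  # unit suffix, e.g. "d"
    number = re.findall("[0-9]+", item)     # all digits, including 0, so "20h" gives "20"
    new_list.append(map_time[letter[0]] * int(number[0]))  # duration in minutes

x[np.argmax(new_list)]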
Kingindanord
2

Provided that your times are well-formed, you can find the max based on a single regular expression:

>>> import re
>>>
>>> durations = ['5d', '20h', '1h', '7m', '14d', '1m']
>>> pattern = re.compile(r'(?:(\d*)d)?(?:(\d*)h)?(?:(\d*)m)?')
>>> max(durations, key=lambda tme: tuple(map(int, pattern.match(tme).groups(default=0))))
'14d'

The regular expression captures days, hours and minutes as strings, with 0 filled in for any unit that is missing. tuple(map(int, ...)) converts them to integers. max picks the largest of these tuples; tuple comparison naturally weights days more strongly than hours, and hours more strongly than minutes.
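One consequence of this pattern is that it also accepts compound strings such as '1d2h'. The following sketch illustrates that; the helper name as_tuple is an assumption, and it uses \d+ and fullmatch rather than the \d* and match of the answer so that malformed input is rejected. As with the other tuple-based answers, it assumes values are normalised (for example, '25h' would compare as smaller than '1d').

```python
import re

# Optional days, hours and minutes parts, in that order.
pattern = re.compile(r'(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?')

def as_tuple(text):
    """Return (days, hours, minutes) for a string such as '1d2h'."""
    match = pattern.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return tuple(int(group) for group in match.groups(default=0))

durations = ['5d', '20h', '1h', '7m', '14d', '1m']
compound = ['1d2h', '1d30m', '23h59m', '1d']

print(as_tuple('1d2h'))              # (1, 2, 0)
print(max(durations, key=as_tuple))  # 14d
print(max(compound, key=as_tuple))   # 1d2h
```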

MisterMiyagi
1

One possible way, converting every duration to seconds first:

duration = ['5d', '20h', '1h', '7m', '14d', '1m', '2d']
duration_std = [0]*len(duration)

equivalence = {"d":60*60*24, "h":60*60, "m":60}

for idx, val in enumerate(duration):
    duration_std[idx] = int(val[:-1])*equivalence[val[-1]]

print(duration[duration_std.index(max(duration_std))])

Output

14d
Clément