8
data = ['str', 'frt']
max(data, key=len)

The max function returns only one of the strings.

How can I make it return both of the strings?

The length of both strings is equal, so max should return both the strings but it returns only one so is there a way to return all max items?

Tomerikoo
  • 18,379
  • 16
  • 47
  • 61
gabber12
  • 1,114
  • 2
  • 11
  • 18
  • 1
    What do you mean? Your question is confusing. Can you give a better example, and show what output you want? – Mark Byers May 30 '12 at 19:51
  • 2
    This might be relevant: http://stackoverflow.com/questions/9853302/using-pythons-max-to-return-two-equally-large-values – abought May 30 '12 at 19:52

4 Answers4

8

You can write this as a list comprehension:

data = ['str', 'frt']
maxlen = max(map(len, data))
result = [s for s in data if len(s) == maxlen]
Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
Hugh Bothwell
  • 55,315
  • 8
  • 84
  • 99
  • Note: it's crucial that maxlen is calculated outside of the list comprehension (and hence only called once). – Andy Hayden Oct 30 '13 at 22:27
  • 5
    Also, more pythonic (and *slightly* faster) to use `max(a, key=len)` in the second line, rather than max.map. – Andy Hayden Oct 30 '13 at 22:31
  • 2
    @AndyHayden In that case, `maxlen` would be the actual string, not 3, so you would have to call `len` again to get the length. Seems more appropriate to do this here. – Anakhand Dec 12 '20 at 19:02
4

Here's a simple function which does this in one pass:

def maxes(a, key=None):
    if key is None:
        key = lambda x: x
    m, max_list = key(a[0]), []
    for s in a:
        k = key(s)
        if k > m:
            m, max_list = k, [s]
        elif k == m:
            max_list.append(s)
    return m, max_list

In action:

In [11]: maxes(['a', 'ab', 'a', 'cd'], key=len)
Out[11]: (2, ['ab', 'cd'])

This may, or may not be faster than running the list comprehension mentioned by the other poster and certainly faster than the sorting... but a little testing suggests its faster:

For a example of strings:

In [20]: a = [''.join(random.choice('abc') for _ in xrange(random.randint(1, 100)))
                                           for i in xrange(1000)]

In [21]: %timeit maxes(a, key=len)
10000 loops, best of 3: 53 µs per loop

In [22]: %timeit m = max(map(len, a)); [s for s in a if len(s) < m]
10000 loops, best of 3: 104 µs per loop

In [23]: %timeit sorted_a = sorted(a, key=len, reverse=True); [s for s in a if len(s) == len(sorted_a[0])]
1000 loops, best of 3: 322 µs per loop

If we look at integers, with a key:

In [30]: a = [random.randint(1, 10000) for i in xrange(1000)]

In [31]: %timeit maxes(a, key= lambda x: x**2)
10000 loops, best of 3: 150 µs per loop

In [32]: %timeit m = max(a, key=lambda x: x**2); [s for s in a if s**2 < m]
1000 loops, best of 3: 183 µs per loop

In [33]: %timeit sorted_a = sorted(a, key=lambda x: x**2, reverse=True); [s for s in a if s ** 2 == sorted_a[0] ** 2]
1000 loops, best of 3: 441 µs per loop

However, without a key the list comprehension is better:

In [34]: %timeit maxes(a)
10000 loops, best of 3: 98.1 µs per loop

In [35]: %timeit m = max(a); [s for s in a if s < m]
10000 loops, best of 3: 49.2 µs per loop

In [36]: %timeit sorted_a = sorted(a, reverse=True); [s for s in a if s == sorted_a[0]]
10000 loops, best of 3: 152 µs per loop

This is expected since the redundant key code is still being applied, if we were toremove that logic (replace calls to key(x) with just x) the function is again slightly faster:

In [37]: %timeit maxes2(a)
10000 loops, best of 3: 39.7 µs per loop
Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
  • Your benchmarks are wrong. For the max+listcomp solutions you always used `<` instead of `==`, making the listcomp work more than it should. – Kelly Bundy Nov 14 '21 at 22:51
  • Your function requires the input to be subscriptable. I've posted a modified version that works in general, and mimics `max`'s behavior for empty input. – Joshua P. Swanson Sep 13 '22 at 20:51
0

Andy Hayden's solution is the most efficient, though it has two issues.

  1. It doesn't gracefully handle empty iterators.
  2. It requires the input to be subscriptable, so e.g. it will error when passed a generator.

Here's a modified version that fixes these issues:

def maxes(a, key=None):
    if key is None:
        key = lambda x: x

    a = iter(a)
    try:
        a0 = next(a)
        m, max_list = key(a0), [a0]
    except StopIteration:
        raise ValueError("maxes() arg is an empty sequence")

    for s in a:
        k = key(s)
        if k > m:
            m, max_list = k, [s]
        elif k == m:
            max_list.append(s)
    return m, max_list
  • "Andy Hayden's solution is the most efficient" - What makes you say that? – Kelly Bundy Sep 13 '22 at 21:26
  • Hmm, for empty input I'd expect an empty list as output, not an error. That empty list would correctly be a list of all max items. And the similar [statistics.multimode](https://docs.python.org/3/library/statistics.html#statistics.multimode) also does that. – Kelly Bundy Sep 13 '22 at 22:57
  • @KellyBundy For efficiency, his solution won most of his timing, and the overall approach can be done in a single pass without creating extra lists. My use case involves iterators with huge numbers of elements that should not be stored in memory all at once. – Joshua P. Swanson Sep 14 '22 at 09:58
  • @KellyBundy For the empty input case, that was my first thought, but Andy Hayden chose to return both the maximum and the list at which it was achieved. One could return 0 or None or something for the maximum along with an empty list, but that's non-canonical. `StopIteration` was the better approach. – Joshua P. Swanson Sep 14 '22 at 09:59
  • Ah, ok. If you return the max key as well, then yes, it's a problem. I just wouldn't have done that :-). The regular `max` function also doesn't. About efficiency: As I commented under Andy's answer last year, those benchmarks are wrong. They're also 9 years old and used Python 2, a lot might've changed in the meantime. I tested the first one again yesterday and Andy's was indeed still faster than the list comp one, but not that much faster anymore. And on other inputs I made up, it was slower. Might also waste a lot of *memory*, for example on `maxes(['abc'] * 999 + ['abcd'], key=len)`. – Kelly Bundy Sep 14 '22 at 15:05
  • @KellyBundy Obviously none of these approaches will universally be more efficient in all possible ways or use cases. My belief is that in the majority of use cases where efficiency actually matters, it's because the iteration is complex, so the one-pass solution is going to win. For small lists, comprehension could easily win, but that case seems unlikely to be a bottleneck. Your example of high memory usage does not seem likely to me in practice, though it is of course possible. – Joshua P. Swanson Sep 14 '22 at 21:25
-1

By definition, the max function returns the maximum value. It doesn't return the item, just the value which is unique (even if there are multiple item with the same max value). I suggest you use a sort algorithm and take whatever values you need.

In your example:

data = ['str','frt']
sorted(data,key=len, reverse=True)
result = [s for s in data if len(s)==len(data[0])]
Samy Arous
  • 6,794
  • 13
  • 20
  • 1
    sorting would take more time then max i think – gabber12 May 30 '12 at 20:08
  • Not much. In term of complexity, sorting is pseudo-linear where max is linear which is more or less equivalent. And once sorted, there are lots of cool things you can do on your data like find the second max, the min, binary search.... It depends on your needs of course. – Samy Arous May 30 '12 at 20:16
  • 3
    I added some timeit information, as suggested above sorting is significantly slower for large datasets. I don't understand what sorting achieves here over using max... You seem to be using it as a less efficient max, so -1. – Andy Hayden Oct 30 '13 at 22:31
  • 2
    Calling `sorted` doesn't do anything if you don't assign the result to a value. Did you mean to use `sort()`? – Kevin Apr 27 '18 at 18:05