Getting highest value from a dict of numbers as strings

Question

production = {
    'item1': '500',
    'item2': '10000',
}

I'm trying to get the highest value from that dict, which would be 10000. However, max(production.values()) returns '500'.

I believe that's because max() compares the values as strings, not ints (by the lexicographical order of their code points).

Could someone help me with a solution? Thanks!

Not the best duplicate target. The duplicates list could be updated, for example, with this question: [Get max number out of strings](https://stackoverflow.com/q/53196684/7851470) — Georgy, May 22 '20 at 16:42

score 6 · Accepted Answer · answered May 18 '20 at 01:42

max() takes a key argument, a function applied to each element to produce the value used for comparison:

production = {
    'item1': '500',
    'item2': '10000',
}

# base max calculation on integer version of value
max(production.values(), key=int) # assuming all are integers
# '10000'

This leaves the values as they are in the dictionary, so the returned value is still a string.

It's also handy if you want the key as well:

max(production.items(), key=lambda pair: int(pair[1]))
# ('item2', '10000')
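If only the key is needed, a minimal sketch along the same lines is to iterate the dict itself and look each value up inside the key function. The names `best_key` and `best_value` are illustrative only, and the sketch assumes every value parses as an integer:

```python
production = {
    'item1': '500',
    'item2': '10000',
}

# Iterating a dict yields its keys, so max() returns a key here,
# ranking each key by the integer form of its value.
best_key = max(production, key=lambda k: int(production[k]))

# Convert once at the end if the number itself is needed.
best_value = int(production[best_key])

print(best_key, best_value)
```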

score 5 · Answer 2 · answered May 18 '20 at 01:50

Not the answer you should accept, but providing some background on performance:

from random import randint
from timeit import timeit

# generating 1,000 dictionaries, each with 10,000 integer keys and random numeric string values
dicts = [{k: str(randint(1, 10000)) for k in range(10000)} for _ in range(1000)]


def all_max1():
    # the answer provided by @MarkMeyer
    return [max(d.values(), key=int) for d in dicts]


def all_max2():
    # the answer provided by @Marceline
    return [max(int(x) for x in d.values()) for d in dicts]


print(timeit(all_max1, number=10))
print(timeit(all_max2, number=10))

The answer @MarkMeyer provided ran about 1.5 times as fast as the answer provided by @Marceline in this test, though both are technically correct.

Result:

7.4847795
11.341150599999999

@JHeron's advice is sound: if you can avoid having strings in that position in the first place, using integers would be more efficient. I assume, though, that your data arrives in the form of strings.

However, if you need to operate on those values more than once (for more than just a single max value), consider converting the original data once up front to avoid repeated conversions later.
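As a rough sketch of that up-front conversion, assuming every value is a clean integer string (otherwise `int()` raises `ValueError`), a dict comprehension builds an integer copy that later calls can reuse. The name `as_ints` is made up for this example:

```python
production = {'item1': '500', 'item2': '10000'}

# Convert every value once; later operations work on real ints.
as_ints = {name: int(value) for name, value in production.items()}

highest = max(as_ints.values())
lowest = min(as_ints.values())
total = sum(as_ints.values())

# With int values, dict.get works directly as the key function.
top_item = max(as_ints, key=as_ints.get)

print(highest, lowest, total, top_item)
```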

Mert Köklü · Answer 3 · 2020-05-18T01:42:41.297

1

You can convert the values to integers and take the max of those:

max(int(x) for x in production.values())

edited May 18 '20 at 01:42

answered May 18 '20 at 01:40

The answer is correct, but doesn't need the square brackets (turning it into a list to pass to `max`), as `max` can also take it from a generator - just leave `[]` off. – Grismar May 18 '20 at 01:41

heron J · Answer 4 · 2020-05-18T01:58:41.990

1

You're getting this output because the values are strings. For max() to compare them numerically, the values must be numbers.

production = {
    'item1': 500,
    'item2': 10000,
}
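A small illustration of the difference, using hypothetical variable names: strings are compared character by character, so '500' sorts after '10000' because '5' comes after '1', while the same numbers as ints compare by magnitude.

```python
as_strings = ['500', '10000']
as_numbers = [500, 10000]

# The first characters decide the string comparison: '5' > '1'.
string_max = max(as_strings)

# Integers compare by numeric value.
number_max = max(as_numbers)

print(string_max)  # prints 500
print(number_max)  # prints 10000
```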

edited May 18 '20 at 01:58

answered May 18 '20 at 01:41

Thanks! However, the values are inserted into the keys using OCR. I don't think there's a way to structure the dict directly using int values. – ankh May 18 '20 at 01:52
