Suppose I have a function designed to find the smallest and largest y values in a list of dictionaries.

s1 = [
    {'x':10, 'y':8.04},
    {'x':8, 'y':6.95},
    {'x':13, 'y':7.58},
    {'x':9, 'y':8.81},
    {'x':11, 'y':8.33},
    {'x':14, 'y':9.96},
    {'x':6, 'y':7.24},
    {'x':4, 'y':4.26},
    {'x':12, 'y':10.84},
    {'x':7, 'y':4.82},
    {'x':5, 'y':5.68},
    ]

def range_y(list_of_dicts):
    y = lambda dict: dict['y']
    return y(min(list_of_dicts, key=y)), y(max(list_of_dicts, key=y))

range_y(s1)

This works and gives the intended result.

What I don't understand is the y before (min(list_of_dicts, key=y)). I know I can find the min and max with min(list_of_dicts, key=lambda d: d['y'])['y'], where the ['y'] lookup goes at the end (swapping min for max as needed).

Can someone explain to me what is happening in y(min(list_of_dicts, key=y)) with the y and the parenthetical?

wjandrea
vashts85
    Well, the `lambda` defines a function. And that function is assigned to `y`. So `y` points to the function defined by the `lambda`. It's as if you did `def y(dict): return dict['y']`. – ChrisGPT was on strike Dec 06 '20 at 23:34
    Note that using `dict` as a variable shadows the `dict` builtin. It's not recommended. If you really can't think of a better name for that variable, consider using `dict_`. – ChrisGPT was on strike Dec 06 '20 at 23:35
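To make the two comments above concrete, here is a minimal sketch. The names y_lambda, y_def, point and shadowing_demo are made up for illustration and do not appear in the question.

```python
# Hypothetical names for illustration only.
y_lambda = lambda d: d['y']      # a lambda assigned to a name


def y_def(d):                    # the equivalent def statement
    return d['y']


point = {'x': 10, 'y': 8.04}
print(y_lambda(point) == y_def(point))  # True


def shadowing_demo(dict):
    # Inside this function, `dict` refers to the argument, not the builtin
    # type, so the builtin constructor is unreachable under that name.
    return callable(dict)


print(shadowing_demo(point))  # False: a dict instance is not callable
```

Both forms behave the same when called; the def version just carries a proper function name, and avoiding `dict` as a parameter name keeps the builtin usable.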

3 Answers

y is a function, defined by the lambda expression. It accepts a dictionary as an argument and returns the value at key 'y' in that dictionary.

min(list_of_dicts, key=y) returns the dictionary from the list with the smallest value under key 'y'.

Putting it together, y(min(list_of_dicts, key=y)) gives you the value at key 'y' of the dictionary that has the smallest 'y' value of all dictionaries in the list.
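It may help to split the nested call into its two steps. This is only a sketch: the three-item s1 is a shortened copy of the question's data, and smallest is a name I made up.

```python
# Shortened copy of the question's data, for illustration.
s1 = [
    {'x': 10, 'y': 8.04},
    {'x': 4, 'y': 4.26},
    {'x': 12, 'y': 10.84},
]

y = lambda d: d['y']

# Step 1: min() compares the dicts by y(d) and returns the whole dict.
smallest = min(s1, key=y)
print(smallest)            # {'x': 4, 'y': 4.26}

# Step 2: calling y on that dict pulls out just the 'y' value.
print(y(smallest))         # 4.26

# The one-liner from the question does both steps at once.
print(y(min(s1, key=y)))   # 4.26
```

So the y in key=y tells min how to compare, and the outer y(...) extracts the number from the winning dictionary.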

wjandrea
ccluff

I know I can find the min and max with min(list_of_dicts, key=lambda d: d['y'])['y'] ...

It's exactly the same as that, but the function y does the indexing. It's a bit shorter and DRYer to write it that way.


Note that named lambdas are generally bad practice (PEP 8 recommends a def statement instead of assigning a lambda to a name), although this case isn't too bad. Best practice here is to use operator.itemgetter:

import operator
y = operator.itemgetter('y')

However, you can do it better by using generator expressions to get the min/max y-values directly, instead of their containing dicts. Then the indexing expression d['y'] is written only twice, which makes the function y practically pointless.

def range_y(list_of_dicts):
    return min(d['y'] for d in list_of_dicts), max(d['y'] for d in list_of_dicts)
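As a quick sanity check, here is a sketch comparing the itemgetter version with the generator-expression version. The names get_y, via_key and via_gen are mine, and s1 is a shortened copy of the question's data.

```python
import operator

# Shortened copy of the question's data, for illustration.
s1 = [
    {'x': 10, 'y': 8.04},
    {'x': 4, 'y': 4.26},
    {'x': 12, 'y': 10.84},
]

get_y = operator.itemgetter('y')   # behaves like lambda d: d['y']

# Select the dicts by key, then extract the 'y' values from them.
via_key = (get_y(min(s1, key=get_y)), get_y(max(s1, key=get_y)))

# Extract the 'y' values first, then take min/max of the plain numbers.
via_gen = (min(d['y'] for d in s1), max(d['y'] for d in s1))

print(via_key == via_gen)  # True
```

Both approaches agree; the second simply never needs to hold on to the containing dicts.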
wjandrea

I'm declaring Abuse of Lambda. Whenever you see a lambda assigned to a variable, you need to ask: why give a name to an anonymous function? And when that function lacks a clear name, why make the code harder to read? The function could be rewritten as follows:

def get_y(d):
    """Return "y" item from collection `d`"""
    return d['y']

def range_y(list_of_dicts):
    return get_y(min(list_of_dicts, key=get_y)), get_y(max(list_of_dicts, key=get_y))

In fact, there is a function in the standard library that does this, operator.itemgetter, so this version may be what readers expect:

import operator

def range_y(list_of_dicts):
    get_y = operator.itemgetter("y")
    return get_y(min(list_of_dicts, key=get_y)), get_y(max(list_of_dicts, key=get_y))

But there is a more straightforward way to write this. itemgetter is useful as a key in the min/max searches, but only confuses things once you've selected the dicts.

import operator

def range_y(list_of_dicts):
    get_y = operator.itemgetter("y")
    return min(list_of_dicts, key=get_y)["y"], max(list_of_dicts, key=get_y)["y"]

But since all you care about is the min/max "y", extract those values and work with them from the beginning.

def range_y(list_of_dicts):
    y_vals = [d["y"] for d in list_of_dicts]
    return min(y_vals), max(y_vals)
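For completeness, a sketch of that last version run against the data from the question. The variable result is a name I added.

```python
# Data copied from the question.
s1 = [
    {'x': 10, 'y': 8.04},
    {'x': 8, 'y': 6.95},
    {'x': 13, 'y': 7.58},
    {'x': 9, 'y': 8.81},
    {'x': 11, 'y': 8.33},
    {'x': 14, 'y': 9.96},
    {'x': 6, 'y': 7.24},
    {'x': 4, 'y': 4.26},
    {'x': 12, 'y': 10.84},
    {'x': 7, 'y': 4.82},
    {'x': 5, 'y': 5.68},
]


def range_y(list_of_dicts):
    # Pull out the 'y' values once, then work with plain numbers.
    y_vals = [d["y"] for d in list_of_dicts]
    return min(y_vals), max(y_vals)


result = range_y(s1)
print(result)  # (4.26, 10.84)
```

One thing to keep in mind: min and max raise ValueError on an empty sequence, so every version here assumes the list has at least one dictionary.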
tdelaney