2

I created two functions that each return a sorted list. Both take as an argument a list containing instances of the Employee class. The first sorts by the name attribute and the second by age, and both use a lambda function.

class Employee():

    allEmployees = []

    def __init__(self, name, age):
        self.name = name
        self.age = age
        Employee.allEmployees.append(self)


def sortEmployeesByName(some_list, name):
    return sorted(some_list, key=lambda employee: employee.name)

def sortEmployeesByAge(some_list, age):
    return sorted(some_list, key=lambda employee: employee.age)

How can I create a single function, sortEmployees, that takes the attribute as its second parameter and still uses a lambda function?

e.g.

def sortEmployees(some_list, attribute):
    return sorted(some_list, key=lambda employee: employee.attribute)
mkrieger1
johnlock1
  • Why not use a normal (`def`) function? – Klaus D. Jul 17 '18 at 18:24
  • @RafaelC Because `sorted` takes a function as a key, but the OP's method takes an attribute name, and wraps that in a function to pass to `sorted`. You can't just write `sorted(some_list, key="age")`. – abarnert Jul 17 '18 at 18:37
  • @abarnert Any reasons to want to pass a `str` rather than `lambda e: e.name`? – rafaelc Jul 17 '18 at 18:38
  • @RafaelC In general, no, but in specific cases, sure. Think of half the Pandas code out there, which passes around column names as names in various places. – abarnert Jul 17 '18 at 18:44
  • Does this answer your question? [Pythonic way to sorting list of namedtuples by field name](https://stackoverflow.com/questions/12087905/pythonic-way-to-sorting-list-of-namedtuples-by-field-name) – mkrieger1 Oct 19 '22 at 11:36

4 Answers

6

You want operator.attrgetter; there is no need for a lambda. This should also perform better:

import operator

sorted(some_list, key=operator.attrgetter('name'))
nosklo
  • 2
    As an amendment, you can also do `from operator import attrgetter` and just reference it without the module as `sorted(some_list, key=attrgetter('name'))` – Sunny Patel Jul 17 '18 at 18:26
2

Using operator.attrgetter. I added a __repr__ method so the example output is readable:

from operator import attrgetter

class Employee:

    allEmployees = []

    def __init__(self, name, age):
        self.name = name
        self.age = age
        Employee.allEmployees.append(self)

    def __repr__(self):
        return f'Employee({self.name}, {self.age})'

def sortEmployees(some_list, attribute):
    f = attrgetter(attribute)
    return sorted(some_list, key=f)

l = [Employee('John', 30),
     Employee('Miranda', 20),
     Employee('Paolo', 42)]

print(sortEmployees(Employee.allEmployees, 'name'))
print(sortEmployees(Employee.allEmployees, 'age'))

Prints:

[Employee(John, 30), Employee(Miranda, 20), Employee(Paolo, 42)]
[Employee(Miranda, 20), Employee(John, 30), Employee(Paolo, 42)]
Andrej Kesely
  • Nice answer. Mind if I steal some of it? :) – PM 2Ring Jul 17 '18 at 18:42
  • @PM2Ring No problem :) (I wanted to make Employee subclass of collections.abc.MutableSequence too...) – Andrej Kesely Jul 17 '18 at 18:47
  • @Andrej Kesely Yeah, I also have __repr__ in my code. I just stripped it down for Stack Overflow so it's easier to read. But I like the way you instantiate Employees, by using a list! – johnlock1 Jul 17 '18 at 18:52
2

Here's another version using operator.attrgetter. I think it makes sense here to give the Employee class a .sort classmethod. I've "borrowed" the __repr__ method and test data from Andrej Kesely. ;)

from operator import attrgetter

class Employee:
    allEmployees = []

    def __init__(self, name, age):
        self.name = name
        self.age = age
        Employee.allEmployees.append(self)

    def __repr__(self):
        return f'Employee({self.name}, {self.age})'

    @classmethod
    def sort(cls, attr):
        return sorted(cls.allEmployees, key=attrgetter(attr))   

Employee('John', 30)
Employee('Miranda', 20)
Employee('Paolo', 42)

print(Employee.sort('name'))
print(Employee.sort('age'))

output

[Employee(John, 30), Employee(Miranda, 20), Employee(Paolo, 42)]
[Employee(Miranda, 20), Employee(John, 30), Employee(Paolo, 42)]

A nice thing about operator.attrgetter is that we can pass it multiple attributes and it will return a tuple of attributes. We can use this to sort by multiple attributes in a single pass. But we need to modify the .sort method slightly. The other code remains the same.

    @classmethod
    def sort(cls, *attrs):
        return sorted(cls.allEmployees, key=attrgetter(*attrs))


Employee('John', 30)
Employee('Miranda', 20)
Employee('Paolo', 42)
Employee('John', 20)

print(Employee.sort('name'))
print(Employee.sort('age'))
print(Employee.sort('name', 'age'))

output

[Employee(John, 30), Employee(John, 20), Employee(Miranda, 20), Employee(Paolo, 42)]
[Employee(Miranda, 20), Employee(John, 20), Employee(John, 30), Employee(Paolo, 42)]
[Employee(John, 20), Employee(John, 30), Employee(Miranda, 20), Employee(Paolo, 42)]
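
To see why the multi-attribute version works, here is a minimal sketch of what attrgetter returns when it is given one name versus several. The SimpleNamespace object is only a hypothetical stand-in for an Employee instance, so the example does not touch Employee.allEmployees.

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical stand-in for an Employee instance.
john = SimpleNamespace(name='John', age=20)

single_key = attrgetter('name')(john)        # one name: the bare value
tuple_key = attrgetter('name', 'age')(john)  # several names: a tuple of values

print(single_key)
print(tuple_key)
```

Because the key is a tuple, sorted compares names first and falls back to age only when the names are equal.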
PM 2Ring
  • Yeah, I also think that a class method makes more sense. I currently have the function inside another class, but I think I like your way more, because it's tidier. – johnlock1 Jul 17 '18 at 18:57
  • 1
    @johnlock1 Thanks. To me, it makes sense for the sort function to belong to the class it's sorting. BTW, I've added a little enhancement to my answer. You may not need it, but it may come in useful. – PM 2Ring Jul 17 '18 at 19:07
  • So in the last example, the multiple-attributes case, it sorts by both name AND age? I guess the second attribute is used to order objects that would otherwise have the same sorted position, like the 'John' objects. – johnlock1 Jul 17 '18 at 19:10
  • Now I wonder: can I have a reverse sort on only one attribute? Because that's exactly what I want to do further down the road. – johnlock1 Jul 17 '18 at 19:13
  • 1
    @johnlock1 Yes, it first compares by name, and if they match then it compares by age. That's just the standard way that Python compares tuples or lists. And if you think about it, that's what happens when you compare strings, too. – PM 2Ring Jul 17 '18 at 19:14
  • @johnlock1 Doing a reverse sort in one attribute is tricky! It can be done in some cases, but in general it's simpler and more readable to do it in two passes. – PM 2Ring Jul 17 '18 at 19:16
  • 1
    @johnlock1 But here's a brief illustration of the technique on a simple list of tuples: `seq = [('John', 30), ('Paolo', 42), ('Miranda', 20), ('John', 20)]; print(sorted(seq, key=lambda t: (t[0], -t[1])))`. – PM 2Ring Jul 17 '18 at 19:23
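
The two-pass approach mentioned in the comments above can be sketched as follows. It relies on sorted being stable, so it also works for keys that cannot be negated, such as strings. The sample data is hypothetical and simply mirrors the tuples from the comment; this is one possible way to do it, not necessarily what the commenter had in mind.

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical sample data mirroring the tuples in the comment above.
people = [
    SimpleNamespace(name='John', age=30),
    SimpleNamespace(name='Paolo', age=42),
    SimpleNamespace(name='Miranda', age=20),
    SimpleNamespace(name='John', age=20),
]

# Pass 1: sort by the secondary key (age, descending).
by_age_desc = sorted(people, key=attrgetter('age'), reverse=True)

# Pass 2: sort by the primary key (name, ascending). sorted() is stable,
# so entries with equal names keep their age-descending order from pass 1.
result = sorted(by_age_desc, key=attrgetter('name'))

pairs = [(p.name, p.age) for p in result]
print(pairs)
```

The secondary key is sorted first and the primary key last, which is the reverse of the order in which the keys are compared.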
1

You probably don't want to do this, but I'll show you how anyway, using getattr:

getattr(object, name[, default])

Return the value of the named attribute of object. name must be a string. If the string is the name of one of the object’s attributes, the result is the value of that attribute. For example, getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise AttributeError is raised.
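
A minimal sketch of the three behaviours described in that documentation excerpt, using a hypothetical SimpleNamespace object named x purely for illustration:

```python
from types import SimpleNamespace

# Hypothetical object used only for illustration.
x = SimpleNamespace(foobar=1)

# Looking the attribute up by name gives the same value as x.foobar.
same = getattr(x, 'foobar') == x.foobar

# With a default, a missing attribute returns the default instead of raising.
fallback = getattr(x, 'missing', None)

# Without a default, a missing attribute raises AttributeError.
try:
    getattr(x, 'missing')
    raised = False
except AttributeError:
    raised = True

print(same, fallback, raised)
```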

So:

def sortEmployees(some_list, age, key_attr):
    return sorted(some_list, key=lambda employee: getattr(employee, key_attr))

However, if the only thing you're using this for is a sort key, attrgetter in the stdlib wraps that up for you, so you don't need to lambda up your own function:

import operator

def sortEmployees(some_list, age, key_attr):
    return sorted(some_list, key=operator.attrgetter(key_attr))

The reason you probably don't want to do this is that mixing up data and variable names is generally a bad idea, as explained by Ned Batchelder better than I could.

You end up with something that looks—to the human reader, and to your IDE, and to static checkers like linters and type checkers, and maybe even the optimizer—like dynamic code, even though what it actually does is purely static. You're getting all of the disadvantages of dynamic code without any of the benefits.

You don't even get shorter method calls:

sortEmployeesByName(some_list, name)
sortEmployees(some_list, name, "name")

However, the reason this is just "probably" rather than "definitely" is that there are cases where the same tradeoff goes the other way.

For example, if you had 15 of these attributes instead of 2, copying and pasting and editing the code 15 times would be a massive DRY violation. Or, imagine you were building the class or its instances dynamically, and the names weren't even known until runtime.

Of course you could write code that dynamically generates the methods at class or instance creation time, so they can then be used statically by client code. And this is a great pattern (used in various places in the stdlib). But for a dead simple case, it may be overcomplicating things badly. (A typical reader can figure out what a getattr means more easily than figuring out a setattr plus a descriptor __get__ call to manually bind a method, obviously.) And it still won't help many static tools understand your type's methods.
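
As a rough sketch of that pattern, the following generates one sort function per attribute when the class is set up. It is a simplified variant that attaches static methods rather than manually binding methods through a descriptor, and the names _make_sorter, sort_by_name, and sort_by_age are hypothetical, chosen only for this illustration.

```python
from operator import attrgetter

class Employee:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def _make_sorter(attr):
    # Hypothetical helper: builds one sort function for the given attribute.
    def sorter(employees):
        return sorted(employees, key=attrgetter(attr))
    sorter.__name__ = f'sort_by_{attr}'
    return staticmethod(sorter)

# Attach sort_by_name and sort_by_age once, right after class creation.
for _attr in ('name', 'age'):
    setattr(Employee, f'sort_by_{_attr}', _make_sorter(_attr))

staff = [Employee('John', 30), Employee('Miranda', 20), Employee('Paolo', 42)]
names_by_age = [e.name for e in Employee.sort_by_age(staff)]
names_by_name = [e.name for e in Employee.sort_by_name(staff)]
print(names_by_age)
print(names_by_name)
```

Client code then calls Employee.sort_by_age(staff) as if it had been written by hand, although a static checker will typically not know that the method exists.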

In many such cases, the way to fix that is to stop having separate named attributes and instead have a single attribute that's a dict holding all the not-quite-attribute things. But again, that's just "many", not "all", and the tradeoff can go the other way. For example, with an ORM class, or something that acts like a Pandas DataFrame, you'd expect to be able to access the attributes as attributes.
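
A minimal sketch of the dict-based alternative, assuming a hypothetical fields attribute and a hypothetical sort_employees function (neither appears in the question):

```python
class Employee:
    def __init__(self, **fields):
        # All the "not-quite-attribute" values live in one dict.
        self.fields = fields

def sort_employees(employees, field):
    # The field name is ordinary data here: a dict key, not an attribute name.
    return sorted(employees, key=lambda e: e.fields[field])

staff = [
    Employee(name='John', age=30),
    Employee(name='Miranda', age=20),
    Employee(name='Paolo', age=42),
]

names_by_age = [e.fields['name'] for e in sort_employees(staff, 'age')]
print(names_by_age)
```

With this layout the string passed as field never pretends to be a variable name, which avoids the mixing of data and names described above.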

So, that's why the feature is there: because sometimes you need it. I don't think you do need it in this case, but it's a judgment call.

abarnert