I come from Ruby, where method chaining is very easy. Let's look at an example. Say I want to select all the even numbers from a list and add 5 to each of them. In Ruby, I would write something like this:

nums = [...]
nums.select {|x| x % 2 == 0 }.map { |x| x + 5 }

In Python, that becomes:

nums = [...]
list(map(lambda x: x + 5, filter(lambda x: x % 2 == 0, nums)))

The Python syntax looks horrible. I tried to Google and didn't really find any good answers. All I saw was how you can achieve something like this with custom objects but nothing to process lists this way. Am I missing something?

In a debugging console, it used to be extremely helpful to load some ActiveRecord objects into an array and just chain methods on them to process the entities while debugging. With Python, it almost seems like too much work.

martineau
streetsoldier
  • `[x + 5 for x in nums if x % 2 == 0]` -- Not horrible if you write it the right way. Don't try and write Python like Ruby. – khelwood Jan 23 '22 at 00:37
  • what if you want to chain more? – streetsoldier Jan 23 '22 at 00:37
  • Admittedly, the Ruby code makes me a little jealous, both the chaining and the lambda syntax (curly braces rather not, they are hard to type on German keyboards). Popular libraries are trying to provide similar interfaces in Python, like `pandas`, `numpy` or `pyspark`. – Michael Szczesny Jan 23 '22 at 00:52
  • Python has list comprehensions, but unlike Ruby not everything in Python is an object. As a result, some of the things you can do in Ruby are simply not feasible in Python, although there's usually a Pythonic alternative. – Todd A. Jacobs Jan 23 '22 at 01:41
  • @Todd A. Jacobs Can you give examples of what you mean by things that are not objects in Python? – Kelly Bundy Jan 23 '22 at 02:02
  • Number 2 is not an object in Python. In Ruby, it is an object. – streetsoldier Jan 23 '22 at 02:15
  • @streetsoldier Number 2 **is** an object in Python. – Kelly Bundy Jan 23 '22 at 02:18
  • Maybe interesting: [Recent discussion](https://mail.python.org/archives/list/python-ideas@python.org/message/E7HDYWYEAI27OZGY7SGM5GNDS7DM5MRX/) in Python-ideas about something like this. – Kelly Bundy Jan 23 '22 at 02:53
  • The example Ruby code can be written as `nums.filter_map { |x| x + 5 if x.even? }`. – Cary Swoveland Jan 23 '22 at 04:18
  • @KellyBundy I suppose. But it's certainly not as integrated into the object system as Ruby's integers are. In Ruby, I can straight-up call `2.even?` to check if it's even, or `2.abs` to get its absolute value. Python integers, at a glance, don't seem to have a `__dict__` or a `__slots__`, while all user-defined classes have one of the two. So, at minimum, Python integers are more removed from "ordinary user-defined objects" than Ruby integers are. – Silvio Mayolo Jan 23 '22 at 04:21
  • @Silvio Mayolo Python's 2 has [quite a few methods](https://tio.run/##K6gsycjPM7YoKPr/v6AoM69EIyWzSMNIU/P/fwA) as well. – Kelly Bundy Jan 23 '22 at 05:23
  • @Silvio Mayolo And check out the bottom of [this answer](https://stackoverflow.com/a/865963/12671057) :-) – Kelly Bundy Jan 23 '22 at 09:19
  • What is the Ruby code `nums = [...]` supposed to do? – steenslag Jan 23 '22 at 11:04
  • @steenslag It indicates that `nums` is a list whose exact contents are not specified by this code. – khelwood Jan 23 '22 at 11:06
  • @khelwood Ah, thank you. I was confused because something like `[1..]` is valid Ruby syntax. – steenslag Jan 23 '22 at 11:25

2 Answers


In Ruby, every enumerable object includes the `Enumerable` module, which is why we get all of those helpful methods you mention. But in Python, there's no common superclass for iterables. An iterable is literally defined as "a thing which supports `__iter__`". There is an abstract class called `Iterable` which pretends to be a superclass of all iterables, but it doesn't provide any methods of that kind and it doesn't sit in the inheritance chain of all iterables. Instead, it overrides the behavior of `isinstance` and `issubclass` using the magic of dunder methods, the same way you can override `+` by writing `__add__`.
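A small sketch of that point, using a hypothetical `Countdown` class that is not part of the original answer: it defines `__iter__` without inheriting from anything special, yet `isinstance` still reports it as an `Iterable`, and it gains no chainable methods from that.

```python
from collections.abc import Iterable


class Countdown:
    """Hypothetical example class: supports __iter__, inherits only from object."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yield start, start - 1, ..., 1
        return iter(range(self.start, 0, -1))


c = Countdown(3)

# The isinstance check succeeds because Iterable looks for __iter__ ...
is_iterable = isinstance(c, Iterable)

# ... even though Iterable is not in the class's inheritance chain.
in_mro = Iterable in type(c).__mro__

# Being iterable does not provide chainable methods such as .map or .filter.
has_map = hasattr(c, "map")

print(is_iterable, in_mro, has_map)
```

The same holds for built-in types: `Iterable` does not appear in `list.__mro__` either, which is why there is no single place where Python could hang Ruby-style `select` and `map` methods.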

The Alakazam library implements exactly this feature. (Disclosure: I am the creator and maintainer of this library, but it does exactly what you're asking for, so I'll mention it here)

Alakazam provides the `Alakazam` class, which wraps any Python iterable and provides, as methods, all of the built-in Python sequence methods, all of the `itertools` module, and some other useful stream-oriented methods that aren't included in Python by default. Consider your example from above:

nums.select {|x| x % 2 == 0 }.map { |x| x + 5 }

In Python, that looks like

list(map(lambda x: x + 5, filter(lambda x: x % 2 == 0, nums)))

With Alakazam, that looks like

zz.of(nums).filter(lambda x: x % 2 == 0).map(lambda x: x + 5).list()

or, using Alakazam's lambda syntax

zz.of(nums).filter(_1 % 2 == 0).map(_1 + 5).list()

Whenever reasonable, Alakazam's methods like `filter` and `map` are lazy to match Python's behavior, so we still need to write `list()` at the end to consume the iterable and produce a single list result.
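To illustrate the general idea of a chainable wrapper, here is a minimal sketch. The `Chain` class and its `to_list` method are hypothetical names invented for this example; this is not Alakazam's implementation or API, just one way such a wrapper could be built on top of the built-in lazy `filter` and `map`.

```python
class Chain:
    """Hypothetical minimal wrapper that adds chainable methods to any iterable."""

    def __init__(self, iterable):
        self._iterable = iterable

    def filter(self, predicate):
        # Built-in filter is lazy, so nothing is consumed yet.
        return Chain(filter(predicate, self._iterable))

    def map(self, func):
        # Built-in map is lazy as well.
        return Chain(map(func, self._iterable))

    def to_list(self):
        # Consume the wrapped iterable and collect the results.
        return list(self._iterable)


nums = [1, 2, 3, 4, 5, 6]
result = Chain(nums).filter(lambda x: x % 2 == 0).map(lambda x: x + 5).to_list()
print(result)  # [7, 9, 11]
```

Each method returns a new `Chain`, which is what lets the calls read left to right like the Ruby version.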

khelwood
Silvio Mayolo

As noted in comments, this Ruby code:

nums = [...]
nums.select {|x| x % 2 == 0 }.map { |x| x + 5 }

As a side note, why not use `#even?` here?

nums = [...]
nums.select {|x| x.even? }.map { |x| x + 5 }

Or even:

nums = [...]
nums.select(&:even?).map { |x| x + 5 }

But nitpicks aside, this can be expressed in Python using a list comprehension, which is very clean.

nums = [...]
[x + 5 for x in nums if x % 2 == 0]

Now, a list comprehension eagerly generates a full list. Imagine an original list like [1, 2, 3, 4, 5, 6, 7, 8]. The list comprehension would give us [7, 9, 11, 13]. With a data set this small, the cost is trivial.

But imagine that nums is list(range(100_000_000)), which is not a trivial data set. Applying this list comprehension to the whole thing will take a lot of time, even if we only need the first five values.

But a generator expression lets us lazily generate the values we need.

from itertools import islice

nums = range(100_000_000)
evens_plus_five = (x + 5 for x in nums if x % 2 == 0)

list(islice(evens_plus_five, 5))  # [5, 7, 9, 11, 13]
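If more steps are needed, for example a second filter applied after the transformation, one option is to stage the generator expressions one per line so the pipeline reads top to bottom. This is only a sketch; the intermediate variable names are made up for illustration.

```python
from itertools import islice

nums = range(100_000_000)

# Each stage is lazy; no values are computed until the final list() call.
evens = (x for x in nums if x % 2 == 0)
plus_five = (x + 5 for x in evens)
divisible_by_three = (y for y in plus_five if y % 3 == 0)

# Only as many source values as needed are pulled through the pipeline.
first_three = list(islice(divisible_by_three, 3))
print(first_three)  # [9, 15, 21]
```

This avoids repeating `x + 5` and keeps each step on its own line, at the cost of naming the intermediate stages.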

As suggested in comments, this lazy evaluation advantage on large data sets can be gained in Ruby quite readily using #lazy and ranges.

nums = (1..100_000_000)
nums.lazy.select(&:even?).map { |x| x + 5 }.take(5).to_a

And if you're using Ruby 2.7 or later, numbered block parameters make that block even cleaner.

nums = (1..100_000_000)
nums.lazy.select(&:even?).map { _1 + 5 }.take(5).to_a
Chris
  • What if I want to filter for "divisible by 3" after the "add 5" transformation? Can you still make that very clean? – Kelly Bundy Jan 23 '22 at 07:06
  • Btw a bit odd that you wrote a list comp only to then complain about it :-). Why didn't you write a generator in the first place? The Ruby code doesn't produce an array, either, does it? But the equivalent of a Python generator? You make it sound like Python's generators have those advantages over Ruby... Do they? – Kelly Bundy Jan 23 '22 at 07:11
  • `[x + 5 for x in nums if x % 2 == 0 and (x + 5) % 3 == 0]` – Chris Jan 23 '22 at 07:12
  • Not a complaint about list comprehensions. Adding an explanation about generator expressions and how they can benefit you when dealing with large datasets. – Chris Jan 23 '22 at 07:13
  • That duplicates the `x + 5`. So I guess you acknowledge that you can't still make it very clean :-) – Kelly Bundy Jan 23 '22 at 07:14
  • Try running `(1..100_000_000).select(&:even?)` in irb. How long does it take? It takes a long time because these methods are eager in their evaluation. It's going to generate a complete Array of results. – Chris Jan 23 '22 at 07:17
  • We could avoid rewriting the `x + 5` with: `[y for y in (x + 5 for x in nums if x % 2 == 0) if y % 3 == 0]` – Chris Jan 23 '22 at 07:20
  • Ah, sorry, I misremembered. Thought these methods automatically result in enumerators. – Kelly Bundy Jan 23 '22 at 07:22
  • It'd be convenient in some cases! – Chris Jan 23 '22 at 07:26
  • That now reads inside out wildly zigzagging. You need to search for the start, you'll find it in the middle. Then the first step (%2) is on the right, the next step (+5) is on the left, then the next is all the way on the right, and then the result value all the way on the left. Ouff. What a mess, compared to Ruby's simple left-to-right layout with nicely separated blocks. – Kelly Bundy Jan 23 '22 at 07:28
  • `(1..100_000_000).lazy.select(&:even?)` is very fast, so maybe talk about that as well then :-) – Kelly Bundy Jan 23 '22 at 07:32
  • Please note that I'm not actually disagreeing with you on style. But I feel the list comprehension/generator expression beats the shown nested filter and map with lambdas. Also for the reasons you mentioned, I prefer the version that duplicates `x + 5`. – Chris Jan 23 '22 at 07:35
  • Yeah, ok :-). I might btw use `[x for x in nums if x % 2 == 0 for x in [x + 5] if x % 3 == 0]`, that at least jumps left only once, for the result value, and since all list comps do that, a reader expects it. Also, it's probably faster, at least since Python 3.9 (where that single-value list idiom was optimized to be a simple assignment rather than an iteration (at least in CPython, no idea about others)). – Kelly Bundy Jan 23 '22 at 07:44
  • Updated with your excellent suggestions. – Chris Jan 23 '22 at 07:45
  • +1 Yep. This is the official Python recommendation: favor list comprehensions whenever possible. While I strongly disagree with it, it is the party line, so it's Python I disagree with, not this answerer. – Silvio Mayolo Jan 23 '22 at 21:27