
The chaining of each_slice and to_a confuses me. I know that each_slice is a member of Enumerable and therefore can be called on enumerable objects like arrays, and chars does return an array of characters.

I also know that each_slice slices the array into groups of n elements (n is 2 in the example below), and that if no block is given, each_slice returns an Enumerator object.

'186A08'.chars.each_slice(2).to_a

But why must we call to_a on the enumerator object if each_slice has already grouped the array into groups of n elements? Why doesn't Ruby just evaluate the enumerator object to what it represents (a collection of n-element groups)?

Daniel Viglione
  • But are you sure that "if each_slice has already grouped the array by n elements" is actually the case? Ruby [generally tries to avoid expanding `Enumerator`s into arrays](https://stackoverflow.com/a/6493033/479863) unless it has to. Enumerators aren't really collections, they're more like pointers *into* collections of possibly unknown size. – mu is too short Sep 23 '18 at 04:01
  • The main need for `to_a` is when you want to chain the enumerator to an array method (one that, like `sort!`, for example, does not have an `Enumerable` counterpart). In that case you need to slip [Enumerable#to_a](http://ruby-doc.org/core-2.5.1/Enumerable.html#method-i-to_a) (or [Enumerable#entries](http://ruby-doc.org/core-2.5.1/Enumerable.html#method-i-entries)) between the enumerator and the `Array` method. Look through [Array](http://ruby-doc.org/core-2.5.1/Array.html) and you will see many instance methods that one might want to chain. Of course, you might also use `to_a` when debugging. – Cary Swoveland Sep 23 '18 at 06:11
  • Your question is unclear. What do you mean by "why must we call `to_a`"? Usually, you don't. If the client code does not require an `Array`, then an `Enumerator` should work just fine, since `Enumerator`s conform to the `Enumerable` protocol. Who told you that "we must call `to_a`"? Why do you believe that to be true? – Jörg W Mittag Sep 23 '18 at 09:52

1 Answer


The purpose of enumerators is deferred, on-demand evaluation. When you call each_slice without a block, you get back an Enumerator object. This object does not compute the entire grouped array up front; it produces each “slice” only when something asks for it. That saves memory and also gives you quite a bit of flexibility in your code.
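As a small sketch using the string from the question (the variable names `slices` and `hex_pairs` are only illustrative), you can check what each_slice hands back and see that Enumerable methods chain onto it without any to_a:

```ruby
# Without a block, each_slice returns an Enumerator rather than an array.
slices = '186A08'.chars.each_slice(2)

puts slices.class          # Enumerator
p slices.to_a              # [["1", "8"], ["6", "A"], ["0", "8"]]

# Enumerable methods can be chained onto the enumerator directly.
hex_pairs = slices.map(&:join)
p hex_pairs                # ["18", "6A", "08"]
```

The to_a call is only needed when the caller really wants an Array of slices.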

This Stack Overflow post has a lot of information that you’ll find useful:

What is the purpose of the Enumerator class in Ruby

To give a cut-and-dried answer to your question “Why must I call to_a when...”: each_slice hasn’t grouped anything yet. It hasn’t looped through the array at all. So far it has only defined an object that records that, when the array is traversed, you want the elements two at a time. You are then free either to force the computation over all elements in the enumerable (by calling to_a), or to use next or each and stop partway through (computing, say, only half of the slices instead of computing all of them and throwing the second half away).
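A rough illustration of those three options, again on the string from the question (the names `pairs`, `seen`, and `all_pairs` are made up for this sketch):

```ruby
pairs = '186A08'.chars.each_slice(2)

# Pull slices one at a time with next.
first_pair  = pairs.next
second_pair = pairs.next
p first_pair               # ["1", "8"]
p second_pair              # ["6", "A"]

# Or iterate with each and stop early; the last slice is never produced here.
seen = []
'186A08'.chars.each_slice(2).each do |pair|
  seen << pair.join
  break if seen.size == 2
end
p seen                     # ["18", "6A"]

# to_a forces every slice to be computed and collected into an array.
all_pairs = '186A08'.chars.each_slice(2).to_a
p all_pairs.size           # 3
```

Note that chars itself still builds the full array of characters; the deferral described here applies to the grouping step.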

It’s similar to how the Range class does not build up the list of elements in the range. (1..100000) doesn’t make an array of 100000 numbers; it defines an object with a begin and an end, on which certain operations can be performed. For example, (1..100000).cover?(5) doesn’t build a massive array to see whether that number is in there; it just checks whether 5 is greater than or equal to 1 and less than or equal to 100000.
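The Range comparison can be tried directly. This is only a sketch; the `endless` range is an extra illustration that is not part of the original example:

```ruby
range = (1..100_000)

# cover? only compares the argument against the two endpoints.
p range.cover?(5)          # true
p range.cover?(100_001)    # false

# A range with an infinite end could never be expanded into an array,
# yet endpoint checks and partial iteration still work.
endless = (1..Float::INFINITY)
p endless.cover?(10**12)   # true
p endless.first(3)         # [1, 2, 3]
```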

The purpose of this all is performance and flexibility.

It may be worth considering whether your implementation actually needs to build an array up front, or whether you can keep your RAM consumption down by iterating over the enumerator instead. (If your real-world scenario is as simple as the one you described, an enumerator won’t help much, but if the array is large, an enumerator could help you a lot.)

Nate
  • In addition to using `next` to iterate through an enumerator without loading the entire collection, you can also chain the `lazy` instance method of `Enumerable` to the enumerator object, like so: `[].each_slice(2).lazy.select { |line| line.size == 5 }.first(5)`, wherein `each_slice` will return an enumerator which `lazy` can work on. – Daniel Viglione Sep 24 '18 at 01:09