73

In Ruby 1.8.6, I have an array of, say, 100,000 user ids, each of which is an int. I want to run a block of code on these user ids, but in chunks: for example, 100 at a time. What is the simplest way to do this?

I could do something like the following, but probably there's an easier way:

a = Array.new
userids.each { |userid|
  a << userid
  if a.length == 100
    # Process chunk
    a = Array.new
  end
}
unless a.empty?
  # Process chunk
end
sepp2k
ChrisInEdmonton
  • possible duplicate of [Need to split arrays to sub arrays of specified size in Ruby](http://stackoverflow.com/questions/3864139/need-to-split-arrays-to-sub-arrays-of-specified-size-in-ruby) – Nakilon Dec 25 '10 at 12:00
  • @Nakilon: Isn't that question newer than this one? – Andrew Grimm Feb 09 '11 at 03:27
  • @Andrew Grimm, to decide which of two questions to close, I look not at date, but at quality of answers. I mean, I advise a person who looks here, to go there ) – Nakilon Feb 09 '11 at 11:23
  • That's like [this Jon Skeet fact](http://meta.stackexchange.com/questions/9134/jon-skeet-facts/9277#9277)! – Andrew Grimm Feb 09 '11 at 12:01

2 Answers

139

Use each_slice:

require 'enumerator' # only needed in ruby 1.8.6 and before
userids.each_slice(100) do |a|
  # do something with a
end
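As a small illustration of how the slices come out, here is a minimal sketch using made-up sample data (the 250-element `userids` array and the `chunk_sizes` variable are assumptions for the example, not part of the question). It shows that the final slice is simply shorter when the array length is not a multiple of the chunk size, so no separate "leftover" handling is needed.

```ruby
# On Ruby 1.8.6 only, add this first: require 'enumerator'
userids = (1..250).to_a   # hypothetical sample data

chunk_sizes = []
userids.each_slice(100) do |chunk|
  # chunk is a plain Array holding up to 100 ids
  chunk_sizes << chunk.length
end

p chunk_sizes  # prints [100, 100, 50]
```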
sepp2k
  • Note that you actually have to explicitly "require 'enumerator'" for this to work: the method is not available in classes that mix in Enumerable, which initially led me to think this answer was wrong. Then I learned better. – Mike Woodhouse Aug 05 '09 at 09:53
  • Yes, you have to require 'enumerator' in 1.8.6 for this to work (which is why I did). In 1.8.7+ enumerator has been moved to core and you no longer have to require it. However, doing so will not cause an error; it simply returns false. So for compatibility reasons you should always require 'enumerator' when using methods from enumerator, so that the code will work with all versions of Ruby. – sepp2k Aug 05 '09 at 10:30
  • @andorov Right. It's not needed in any Ruby version greater than 1.8.6 (as mentioned in my previous comment). – sepp2k Jun 05 '14 at 20:52
33

Rails has in_groups_of, which under the hood uses each_slice.

userids.in_groups_of(100) { |group|
  # process group
}
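One difference from each_slice worth knowing: by default in_groups_of pads a short final group with nil, and passing false as the second argument turns the padding off. For a project without Rails, that behaviour can be approximated in plain Ruby. The sketch below is a hypothetical helper (`groups_of` is a made-up name, and this is not the Rails source), built only on each_slice.

```ruby
# Hypothetical helper, not the Rails implementation: mimics the padding
# behaviour of in_groups_of using only each_slice.
# On Ruby 1.8.6 only, add this first: require 'enumerator'
def groups_of(items, size, fill_with = nil)
  groups = []
  items.each_slice(size) do |slice|
    # Pad a short final slice unless the caller passed false
    slice += [fill_with] * (size - slice.length) unless fill_with == false
    groups << slice
  end
  groups
end

padded   = groups_of([1, 2, 3, 4, 5], 2)         # last group padded with nil
unpadded = groups_of([1, 2, 3, 4, 5], 2, false)  # last group left short

p padded    # prints [[1, 2], [3, 4], [5, nil]]
p unpadded  # prints [[1, 2], [3, 4], [5]]
```

If the chunks hold ids that are then passed to a query, the unpadded form (or plain each_slice) is probably what you want, since the nil padding would otherwise have to be filtered out.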
wombleton
  • We don't use Rails. It doesn't scale sufficiently far for us; our databases are sharded across about 26 shards. Plus, we have a significant number of other database servers, probably another twenty or so, though these aren't sharded. Thanks for the suggestion, though, I'm sure that'll be useful for plenty of other people. – ChrisInEdmonton Aug 05 '09 at 14:46
  • Yeah, I got that you weren't using Rails, which was why I linked through to the source so you could pull the method if you wanted it. – wombleton Aug 06 '09 at 02:10
  • And why I awarded you a +1. :) A good answer that didn't specifically work for me, but would for others. – ChrisInEdmonton Aug 06 '09 at 15:04
  • Thanks for this - I missed it when looking over the Rails docs. Just to clarify on the last comment, this __is__ available in Rails 3.x ([docs](http://api.rubyonrails.org/classes/Array.html#method-i-in_groups)). Also, and while this is 3 years too late, you can include `active_support` in non-Rails projects, since it is a gem all on its own. – theTRON Jan 19 '12 at 02:04