2

I have a VERY long string of numbers (1000 characters). I would like to break it down into chunks of 5 and insert them into an array arr.

str = "7316717653133062491922511967442657474206326239578318016 ..."

I tried each_slice, but when I attempt to require 'enumerator', irb says false, and the call itself fails:

str.each_slice(5).to_a

I would like the output to look like:

arr = [ "73167", "17653", "33062", ... ] 

How can this be attained?

MrPizzaFace
  • When require returns `false`, it means you already have the library loaded. If it could not load the library you would get an error. – Neil Slater Jan 13 '14 at 00:06
  • Good to know but why do I get this error `undefined method each_slice for #` – MrPizzaFace Jan 13 '14 at 00:08
  • 2
    `String` does not have an `each_slice` method. The method is defined for `Array` – Neil Slater Jan 13 '14 at 00:09
  • Possible duplicate of [What is the best way to chop a string into chunks of a given length in Ruby?](https://stackoverflow.com/questions/754407/what-is-the-best-way-to-chop-a-string-into-chunks-of-a-given-length-in-ruby) – outis Jan 11 '19 at 06:08
  • I would call 1KB "a VERY long string :) – akim Dec 27 '22 at 15:34
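
The two points made in the comments above can be checked directly. This is a minimal sketch, assuming Ruby 1.9 or later (where String no longer includes Enumerable); the short `str` is just the first digits of the string from the question.

```ruby
str = "73167176531330624919"

# String does not mix in Enumerable, so it has no each_slice ...
string_has_it = str.respond_to?(:each_slice)        # => false
# ... but the Array returned by String#chars does.
array_has_it  = str.chars.respond_to?(:each_slice)  # => true

# require returns false when the library is already loaded;
# it raises LoadError only when the library cannot be found.
require 'enumerator'
again = require 'enumerator'                        # => false
```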

5 Answers

11

The problem is that you're trying to perform an enumerable method on a non-enumerable object (a string). You can try using scan on the string to find groups of 5:

arr = str.scan(/.{1,5}/)

If you wanted to go the enumerable route, you could first break up the string into a character array, get groups of 5, then join them back into 5-character strings:

arr = str.chars.each_slice(5).map(&:join)
Dylan Markow
  • I like this, I really have to start thinking in regex's more :( – Senjai Jan 13 '14 at 00:08
  • I love the one line regex solution. Sweet! Thanks. – MrPizzaFace Jan 13 '14 at 00:11
  • Oh you beat me for 2 minutes + your answer is more polished, but why use the /.{1,5}/ notation when he could just use `\d{4}` or '\.{4}` if it's alphanumeric? – patm Jan 13 '14 at 00:12
  • 1
    @atmosx oops, I didn't notice that his string was exactly 1000 characters. Using `1,5` will allow the last captured chunk to be less than 5 characters if necessary. – Dylan Markow Jan 13 '14 at 13:40
  • Yes you are right, my approach is more *error prone* if the last chunk is not exactly five. Thanks for the explanation. – patm Jan 13 '14 at 14:22
  • This appears to be very costly when the string is large. – akim Dec 27 '22 at 14:32
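
The difference between the patterns discussed in the comments above shows up as soon as the length is not a multiple of 5. A small sketch, using a 12-character sample string chosen only for illustration:

```ruby
str = "731671765313"   # 12 characters, so the last chunk is short

# {1,5} is greedy: it takes 5 characters when it can, fewer at the end.
with_remainder = str.scan(/.{1,5}/)
# => ["73167", "17653", "13"]

# An exact count silently drops the trailing "13".
exact_only = str.scan(/\d{5}/)
# => ["73167", "17653"]

# The Enumerable route keeps the short final chunk as well.
via_slices = str.chars.each_slice(5).map(&:join)
# => ["73167", "17653", "13"]
```

Note that `.` does not match a newline unless the regex has the `m` flag, which does not matter for a string of digits.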
5

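Don't know why you're requiring enumerator; it's in Ruby core and doesn't need to be required.

arr = []
string = str.dup              # work on a copy, because slice! mutates its receiver
until string.empty?
  arr << string.slice!(0..4)  # remove and collect the first 5 characters
end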
Senjai
2

I would use a regexp. I think, without having done any testing, that it's a much faster solution.

Here's some code:

2.0.0-p247 :001 > string = '1231249081029381028401982301984870895710394871023857012378401928374102394871092384710398275018923501892347'
 => "1231249081029381028401982301984870895710394871023857012378401928374102394871092384710398275018923501892347" 
2.0.0-p247 :002 > string.scan(/\d{4}/)
 => ["1231", "2490", "8102", "9381", "0284", "0198", "2301", "9848", "7089", "5710", "3948", "7102", "3857", "0123", "7840", "1928", "3741", "0239", "4871", "0923", "8471", "0398", "2750", "1892", "3501", "8923"] 
2.0.0-p247 :003 > 

NOTE: I'm using 4 characters in my example, not 5, but you get the idea.
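
Since the speed claim above is untested, one way to check it is with the standard `benchmark` library. This is only a sketch: the random 1000-digit input and the iteration count `n` are assumptions made here, and timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Random 1000-digit string standing in for the one in the question.
str = Array.new(1000) { rand(10).to_s }.join
n = 1_000

scan_time  = Benchmark.realtime { n.times { str.scan(/.{1,5}/) } }
slice_time = Benchmark.realtime { n.times { str.chars.each_slice(5).map(&:join) } }

puts format("scan:       %.4fs", scan_time)
puts format("each_slice: %.4fs", slice_time)
```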

patm
1

I would be careful using .chars because it has to allocate a separate array with the string's characters. In general I recommend using blocks if available or indexing since it will run faster and be more efficient memory-wise. In the past I've used a splitter with blocks like:

def splitter(input, chunk_size = 2)
  # Index of the last chunk; rounds up so a shorter final chunk is not dropped.
  last = (input.length - 1) / chunk_size
  (0..last).each do |i|
    yield input.slice(i * chunk_size, chunk_size) if block_given?
  end
end

:008 > splitter("test\nwow") {|x| p x}
"te"
"st"
"\nw"
"ow"
 => 0..3
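
A variant of the same index-based idea, sketched here under the assumption that you also want the trailing partial chunk and an Enumerator when no block is given. The name `each_chunk` is hypothetical, not part of Ruby.

```ruby
# Hypothetical helper: yields successive chunks of input without
# building an intermediate array of characters.
def each_chunk(input, chunk_size)
  return enum_for(__method__, input, chunk_size) unless block_given?

  # Step through the start offsets 0, chunk_size, 2 * chunk_size, ...
  0.step(input.length - 1, chunk_size) do |offset|
    yield input[offset, chunk_size]
  end
end

chunks = each_chunk("test\nwow!", 2).to_a
# => ["te", "st", "\nw", "ow", "!"]
```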
1

I personally followed the idea of user8556428, to avoid the costly intermediate values that most proposals introduce, and to avoid modifying the input string. And I want to be able to use it as a generator (for instance to use s.each_slice.with_index).

My use case is really about bytes, not characters. For character-sized chunks, strscan is a fine solution.

class String
    # Slices of fixed byte-length.  May cut multi-byte characters.
    def each_slice(n = 1000)
        # Without a block, return an Enumerator (even for an empty string).
        return enum_for(__method__, n) unless block_given?
        return if empty?

        # bytesize/byteslice count bytes; length/slice would count characters.
        last = (bytesize - 1) / n
        (0 .. last).each do |i|
            yield byteslice(i * n, n)
        end
    end
end
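
To illustrate the byte versus character distinction mentioned above, here is a small sketch that does not reopen String. The helper name `byte_chunks` and the sample string are assumptions made for the example.

```ruby
# Hypothetical helper: fixed byte-length slices, usable as a generator.
def byte_chunks(s, n)
  return enum_for(__method__, s, n) unless block_given?

  0.step(s.bytesize - 1, n) { |offset| yield s.byteslice(offset, n) }
end

s = "h\u00E9llo"   # "héllo": 5 characters, 6 bytes in UTF-8

char_chunks = s.chars.each_slice(2).map(&:join)   # 3 chunks of at most 2 characters
raw_chunks  = byte_chunks(s, 2).to_a              # 3 chunks of exactly 2 bytes

# Byte slicing cuts the two-byte "é" in half, so the first chunk
# is no longer valid UTF-8.
first_valid = raw_chunks.first.valid_encoding?    # => false

# Because an Enumerator is returned, it chains with with_index.
indexed = byte_chunks(s, 4).with_index.map { |chunk, i| [i, chunk.bytesize] }
# => [[0, 4], [1, 2]]
```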
akim