String#split strange behavior

Question

I observed some strange behavior in String's split method.

"1..2".split('..')      # => ['1', '2']
"1..2".split('..', 2)   # => ['1', '2']

"..2".split('..')       # => ['', '2']
"..2".split('..', 2)    # => ['', '2']

Everything is as expected so far, but now:

"1..".split('..')       # => ['1']
"1..".split('..', 2)    # => ['1', '']

I would expect the first to return the same result as the second.

Does anyone have a good explanation for why "1..".split('..') returns an array with just one element? Or is it an inconsistency in Ruby? What do you think about that?

You can take a look: http://stackoverflow.com/questions/3568222/array-slicing-in-ruby-looking-for-explanation-for-illogical-behaviour-taken-fr — suvankar, Sep 16 '13 at 13:15

lurker · Accepted Answer · 2013-09-16T13:52:54.647


According to the Ruby String documentation for split:

If the limit parameter is omitted, trailing null fields are suppressed.
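To make that documented rule concrete, here is a small sketch using the strings from the question plus a comma-separated one that I made up for illustration. It suggests that only trailing empty fields are dropped when no limit is given; leading and interior empty fields stay.

```ruby
# No limit given: every trailing empty field is dropped, not just one.
no_limit      = "1..".split('..')    # the trailing "" after the delimiter is suppressed
several       = "1,2,,,".split(',')  # all three trailing "" fields are suppressed

# Leading and interior empty fields are not "trailing", so they are kept.
leading_kept  = "..2".split('..')
interior_kept = "1,,2".split(',')

p no_limit       # ["1"]
p several        # ["1", "2"]
p leading_kept   # ["", "2"]
p interior_kept  # ["1", "", "2"]
```

This is why "..2" and "1.." look asymmetric: the empty field is kept in one case and suppressed in the other purely because of its position.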

Regarding the limit parameter, the Ruby documentation isn't totally complete. Here is a little more detail:

If limit is positive, split returns at most that number of fields. The last element of the returned array is the "rest of the string", or a single null string ("") if there are fewer fields than limit and there's a trailing delimiter in the original string.

Examples:

"2_3_4_".split('_',3)
=> ["2", "3", "4_"]

"2_3_4_".split('_',4)
=> ["2", "3", "4", ""]
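As a further sketch of the positive-limit case, the snippet below tries a limit that is larger than the number of fields, a limit of 1, and the "1.." string from the question. The variable names are only for illustration.

```ruby
# Limit larger than the number of fields: nothing is cut short,
# and the empty field after the trailing delimiter is kept.
big_limit_trailing = "2_3_4_".split('_', 10)

# Same limit, but no trailing delimiter: there is no empty field to keep.
big_limit_plain    = "2_3_4".split('_', 10)

# Limit of 1: the whole string comes back as the only element.
limit_one          = "2_3_4_".split('_', 1)

# One delimiter yields two fields, so a limit of 3 still gives two elements.
question_case      = "1..".split('..', 3)

p big_limit_trailing  # ["2", "3", "4", ""]
p big_limit_plain     # ["2", "3", "4"]
p limit_one           # ["2_3_4_"]
p question_case       # ["1", ""]
```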

If limit is zero [not mentioned in the documentation], split appears to return all of the parsed fields, and no trailing null string ("") element if there is a trailing delimiter in the original string. I.e., it behaves as if limit were not present. (It may be implemented as a default value.)

Example:

"2_3_4_".split('_',0)
=> ["2", "3", "4"]
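A quick way to check the "behaves as if limit were not present" observation is to compare both calls over a few inputs. This is only a sketch; the sample strings are my own and do not prove the equivalence in general.

```ruby
# Hypothetical sample inputs, chosen to include leading, interior,
# and trailing delimiters as well as the empty string.
samples = ["2_3_4_", "_2_3", "2__3__", ""]

# For each sample, compare split with no limit against split with limit 0.
same_as_omitted = samples.all? do |s|
  s.split('_') == s.split('_', 0)
end

p same_as_omitted       # true
p "1..".split('..', 0)  # ["1"]
```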

If limit is negative, split returns all of the parsed fields, and trailing null strings are not suppressed: a trailing delimiter in the original string produces a trailing null string ("") element.

Examples:

"2_3_4".split('_',-2)
=> ["2", "3", "4"]

"2_3_4".split('_',-5)
=> ["2", "3", "4"]

"2_3_4_".split('_',-2)
=> ["2", "3", "4", ""]

"2_3_4_".split('_',-5)
=> ["2", "3", "4", ""]
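Applied to the string from the question, a negative limit appears to give the result the asker expected. The sketch below also tries a string with several trailing delimiters (my own example) to show what happens to the extra empty fields.

```ruby
# The asker's case: a negative limit keeps the trailing empty field.
expected_by_asker = "1..".split('..', -1)

# Several trailing delimiters: each one contributes its own empty field.
many_trailing     = "2_3_4___".split('_', -1)

# For comparison, the same string with no limit drops all of them.
many_suppressed   = "2_3_4___".split('_')

p expected_by_asker  # ["1", ""]
p many_trailing      # ["2", "3", "4", "", "", ""]
p many_suppressed    # ["2", "3", "4"]
```

So if every empty field matters, for instance when parsing fixed-column delimited records, passing -1 as the limit is one way to keep them.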

It would seem that something a little more useful or interesting could have been done with the negative limit.

edited Sep 16 '13 at 13:52

answered Sep 16 '13 at 13:11

lurker


Interesting, but why is "1..".split("..", 3) ["1", ""] and not ["1", "", ""]? (ruby 1.9.2p290) – Matthias Sep 16 '13 at 13:13
Okay, but if `"..2".split('..')` is `['', '2']` and not `[nil, '2']`, why is `"1..".split('..')` then `['1', nil]` (with the `nil` omitted) and not `['1', '']`? – spickermann Sep 16 '13 at 13:25
@spickermann that's the way they chose to implement it. The text doesn't say `nil` it says "null" which, generically, is "empty" (`''`). – lurker Sep 16 '13 at 13:28
@Mattherick that's an interesting implementation detail which is unspecified in the documentation. For a positive `limit`, it only indicates that "at most" that many fields will be supplied. An even more interesting question is: Why does a negative limit provide no limit to the number of fields selected and does not suppress the single null value? A limit of `0` already produces all of the fields with no trailing null, and a positive limit provides the single null if there's a trailing delimiter. So why not use a negative limit as a means of providing all of the nulls? A Ruby library quirk. :) – lurker Sep 16 '13 at 13:31
