In Ruby, how do you split a string on either a run of two or more whitespace characters or a single tab character? I tried this:

2.4.0 :003 > a = "b\tc\td"
 => "b\tc\td" 
2.4.0 :005 > a.strip.split(/([[:space:]][[:space:]]+|\t)/)
 => ["b", "\t", "c", "\t", "d"]

but the tabs themselves are getting turned into tokens and that's not what I want. The above should return

["b", "c", "d"]
Dave

3 Answers


It happens because the group you used is a capturing one. See the String#split reference:

If pattern contains groups, the respective matches will be returned in the array as well.

Use a non-capturing group (used only for grouping patterns) to avoid adding matched strings into the resulting array:

a.strip.split(/(?:[[:space:]][[:space:]]+|\t)/)
                ^^
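
To make the difference concrete, here is a small sketch that runs the question's input through both patterns. The variable names are illustrative only, and the third variant (dropping the group altogether) is an extra observation rather than part of the answer above; it works here because the alternation is the entire pattern.

```ruby
a = "b\tc\td"

# Capturing group: the matched separators come back as tokens too.
with_capture = a.strip.split(/([[:space:]][[:space:]]+|\t)/)
p with_capture      # => ["b", "\t", "c", "\t", "d"]

# Non-capturing group: only the fields between separators are returned.
without_capture = a.strip.split(/(?:[[:space:]][[:space:]]+|\t)/)
p without_capture   # => ["b", "c", "d"]

# No group at all: the alternation already spans the whole pattern.
no_group = a.strip.split(/[[:space:]][[:space:]]+|\t/)
p no_group          # => ["b", "c", "d"]
```
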
Graham
Wiktor Stribiżew

In this instance you can use a character class that includes both spaces and tabs in your regular expression:

"b\tc\td".split /[ \t]+/

If you want to split on any whitespace, you can also use \s+ (equivalently [\s]+), which matches one or more whitespace characters such as spaces, tabs, and newlines.
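
One thing worth checking against your own data: both patterns above also split on a single space, whereas the question asked for two or more whitespace characters or a tab. A rough comparison, using a made-up sample string and illustrative variable names:

```ruby
s = "b\tc  d e"   # one tab, one double space, one single space

# Character class: any run of spaces or tabs is a separator.
by_class = s.split(/[ \t]+/)
p by_class    # => ["b", "c", "d", "e"]

# \s+: any run of whitespace is a separator.
by_any_ws = s.split(/\s+/)
p by_any_ws   # => ["b", "c", "d", "e"]

# Two or more whitespace characters, or a single tab: "d e" stays together.
by_two_or_tab = s.split(/[[:space:]]{2,}|\t/)
p by_two_or_tab   # => ["b", "c", "d e"]
```

If single spaces never occur inside your fields, the shorter patterns are fine.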

coreyward

There are some easier approaches than the accepted solution:

a.strip.split("\s")

or

a.split("\s")

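In a double-quoted string, "\s" is just a single space, and split treats a single-space pattern specially: it splits on runs of any whitespace (spaces, tabs, newlines) and ignores leading whitespace. Note that this also splits on a lone space.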

For the above case you can simply use:

a = "b\tc\td" 
a.split("\t")    #=> ["b", "c", "d"]

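Or, for a combination of multiple spaces and tabs (the gsub is not strictly needed, since a single-space pattern already treats tabs as whitespace):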

a.gsub("\t", " ").split("\s")     #=> ["b", "c", "d"]
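
A short sketch of what is going on with "\s" here, using a made-up input with mixed spaces and tabs (the variable names are illustrative only):

```ruby
# In double quotes, \s is an escape for a plain space; in single quotes it is
# a literal backslash followed by the letter s.
double_quoted_is_space = ("\s" == " ")
single_quoted_is_space = ('\s' == " ")
p double_quoted_is_space   # => true
p single_quoted_is_space   # => false

messy = "  b\t\tc   d "

# A single-space pattern splits on runs of any whitespace and skips
# leading whitespace, so all three calls behave the same way.
via_escape  = messy.split("\s")
via_space   = messy.split(" ")
via_default = messy.split
p via_escape    # => ["b", "c", "d"]
p via_space     # => ["b", "c", "d"]
p via_default   # => ["b", "c", "d"]
```
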
chitresh