How do I remove the tabs, newlines, and spaces from this array?

array1 = ["E", "A", "C", "H", " ", "L", "I", "N", "E", " ", "E", "N", "D", "S", " ", "W", "I", "T", "H", " ", "A", " ", "A", "C", "C", "I", "D", "E", "N", "T", "A", "L", "L", "Y", " ", " ", "A", "D", "\"", "A", " ", "A", "C", "C", "I", "\n", "\""]

I have tried the following, and none of these seem to work properly.

array1.map!(&:strip)

array1.reject!(&:empty?)

array1.reject(&:empty?)

array1 - [""]

array1.delete_if {|x| x == " " } 
sawa
Ann Left

3 Answers

array1 = ["E", " ", ":", "L", "É", "\t", "T",
          "-", "H", "\n", "\""]

array1.reject { |s| s.match?(/\s/) }
  #=> ["E", ":", "L", "É", "T", "-", "H", "\""]

\s in a regular expression matches the ASCII whitespace characters, namely, spaces, tabs ("\t"), vertical tabs ("\v"), newlines ("\n"), carriage returns ("\r") and formfeeds ("\f").

The latter two date from the days when teletype machines were used: the carriage return moved the printhead from the end of the line back to the beginning, and the form feed advanced the paper to the top of the next page. [1]

[1] Microsoft Windows still recognizes carriage returns and formfeeds, thereby maintaining support for teletype machines. ¯\_(ツ)_/¯
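
To see which characters \s actually covers, here is a small sketch. The candidates hash and its labels are made up for illustration and are not part of the question. It also checks the POSIX bracket expression [[:space:]], which in Ruby additionally matches Unicode whitespace such as the non-breaking space, something \s does not do.

```ruby
# Hypothetical lookup table: label => single-character string to test.
candidates = {
  "space"              => " ",
  "tab"                => "\t",
  "vertical tab"       => "\v",
  "newline"            => "\n",
  "carriage return"    => "\r",
  "form feed"          => "\f",
  "non-breaking space" => "\u00A0"
}

# true/false per character for the ASCII-only \s class.
ascii_matches = candidates.transform_values { |ch| ch.match?(/\s/) }

# true/false per character for the Unicode-aware POSIX class.
posix_matches = candidates.transform_values { |ch| ch.match?(/[[:space:]]/) }

candidates.each_key do |label|
  puts format("%-20s \\s: %-5s [[:space:]]: %s",
              label, ascii_matches[label], posix_matches[label])
end
```

If the array might contain non-breaking spaces (for example, text copied from a web page), rejecting on /[[:space:]]/ instead of /\s/ is probably the safer choice.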

Cary Swoveland

You can use grep to select elements matching a pattern. That pattern can be a simple regexp like /\s/ which matches whitespace characters:

array1.grep(/\s/)
#=> [" ", " ", " ", " ", " ", " ", " ", " ", "\n"]

The result is an array with all elements containing at least one whitespace character.

There's also \S (uppercase) which matches non-whitespace characters:

array1.grep(/\S/)
#=> ["E", "A", "C", "H", "L", "I", "N", "E", "E", "N", "D", "S", "W",
#    "I", "T", "H", "A", "A", "C", "C", "I", "D", "E", "N", "T", "A",
#    "L", "L", "Y", "A", "D", "\"", "A", "A", "C", "C", "I", "\""]

And we have grep_v which is the inverted version of grep. This would be useful if you wanted to specify space, tab and newline explicitly:

array1.grep_v(/[ \t\n]/)
#=> ["E", "A", "C", "H", "L", "I", "N", "E", "E", "N", "D", "S", "W",
#    "I", "T", "H", "A", "A", "C", "C", "I", "D", "E", "N", "T", "A",
#    "L", "L", "Y", "A", "D", "\"", "A", "A", "C", "C", "I", "\""]
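
One caveat worth sketching: grep(/\S/) and grep_v(/\s/) give the same result above only because every element is a single character. With a multi-character element they can differ. The sample array below is hypothetical and only meant to show that difference.

```ruby
# Hypothetical sample: single characters plus one multi-character string.
chars = ["E", " ", "\t", "A", "\n", "a b"]

# Keeps every element that contains at least one non-whitespace character.
with_non_space = chars.grep(/\S/)

# Drops every element that contains at least one whitespace character.
without_space = chars.grep_v(/\s/)

p with_non_space  # "a b" is kept: it contains non-whitespace characters
p without_space   # "a b" is dropped: it contains a space

# Neither call modifies the receiver; chars still has all six elements.
p chars.size
```

Both methods return a new array, so assign the result (for example array1 = array1.grep_v(/\s/)) if the cleaned array is needed afterwards.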
Stefan
  • This is the most suitable answer imo. – 3limin4t0r Feb 25 '19 at 11:12
  • @Johan, I'll second that, in part because of the reference to `grep_v`, which I've not seen used before. Ann, please consider moving the greenie to this answer. – Cary Swoveland Feb 25 '19 at 16:19
  • Readers: repeat 100 times, "To select with a regex, think grep. To reject, grep_v." – Cary Swoveland Feb 25 '19 at 17:22
  • ^ The better mnemonic would be: Can I match elements using the case equality (`===`)? If the answer is *yes* use `grep`. Another example without regex could be: `[1, 'A', :b].grep(Integer) #=> [1]` or `(1..100).grep(95..150) #=> [95, 96, 97, 98, 99, 100]` – 3limin4t0r Feb 26 '19 at 11:18

In addition, here are a few other possible variants, all of which modify the array in place:

array1 = [" ", "A", "\n", "\t", "B", "\r"]
array1.delete_if { |s| s.match?(/\s/) }
#=> ["A", "B"]

array1 = [" ", "A", "\n", "\t", "B", "\r"]
array1.keep_if { |s| !s.match?(/\s/) }
#=> ["A", "B"]

array1 = [" ", "A", "\n", "\t", "B", "\r"]
array1.select! { |s| !s.match?(/\s/) }
#=> ["A", "B"]

Using match? rather than match is preferable, and not only because the MatchData object is not needed here: match? returns a plain true or false without building a MatchData or updating $~.

Benchmarks show that match? is almost 2 times faster, which can be significant when working with large amounts of data.
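
A rough way to check the speed difference on your own machine is sketched below. The data array and the elapsed helper are hypothetical names introduced for this example, and the timings will vary with Ruby version and hardware, so treat the numbers as indicative only.

```ruby
# Hypothetical data set: 600,000 single-character strings.
data = [" ", "A", "\n", "\t", "B", "\r"] * 100_000

# Hypothetical helper: runs the block and returns elapsed seconds,
# measured with the monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

with_match_p = nil
with_match   = nil

# match? only answers true/false.
time_match_p = elapsed { with_match_p = data.reject { |s| s.match?(/\s/) } }

# match builds a MatchData object for every element that matches.
time_match = elapsed { with_match = data.reject { |s| s.match(/\s/) } }

puts format("match?: %.3fs", time_match_p)
puts format("match:  %.3fs", time_match)
```

Both variants return the same filtered array, so only the running time differs.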

mechnicov