4

I want to do something like this:

def get_count(string)
 sentence.split(' ').count
end

I think there might be a better way; String may have a built-in method to do this.

mu is too short
mko
  • String doesn't have a built-in method to do it because what you call a "word" might not agree with what someone else calls one. For instance, "one-way" is a compound word. Do you want to count it as one, or two words? Your definition might not include numerics inside a word, but the regex `\w` includes them; `[a-zA-Z0-9_]` is its definition. So, like most languages, the base class gives you the building blocks and you have to take it from there. – the Tin Man Jun 22 '11 at 19:59
  • BTW, your example code `sentence.split(' ').count` should probably be `string.split(' ').count`. – the Tin Man Jun 22 '11 at 20:00
  • @the Tin Man, you are right, this could cause some confusion – mko Jul 09 '11 at 13:32
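
To illustrate the point raised in the comments, here is a rough sketch (not a recommendation) of how the same sentence yields three different counts depending on what is treated as a word. The sample sentence and the variable names are made up for this example.

```ruby
# Three plausible definitions of a "word", applied to the same text.
text = "It's a one-way street, built in 1999."

by_whitespace = text.split.size              # whitespace-separated chunks
by_word_chars = text.scan(/\w+/).size        # runs of [a-zA-Z0-9_]
letters_only  = text.scan(/[a-zA-Z]+/).size  # numerals are not words here

puts by_whitespace  # => 7
puts by_word_chars  # => 9 ("It's" and "one-way" each split in two)
puts letters_only   # => 8 ("1999" is dropped)
```

None of the three is wrong; which one fits depends on the text being counted.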

9 Answers

5

I believe count is meant for counting matching elements (it takes an argument or a block), so you probably want to use length.

def get_count(string)
    string.split(' ').length
end

Edit: If your string is really long, splitting it creates a large intermediate array and needs more memory, so here's a way that counts the spaces without building one:

def get_count(string)
    (0..(string.length-1)).inject(1){|m,e| m + (string[e].chr == ' ' ? 1 : 0) }
end
Candide
3

If the only word boundary is a single space, just count them.

puts "this sentence has five words".count(' ')+1 # => 5

If there are spaces, line endings, tabs, commas followed by a space, etc. between the words, then scanning for word boundaries is a possibility:

puts "this, is.\tfour   words".scan(/\b/).size/2 # => 4
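
A short sketch of why the division by two works, assuming words consist only of `\w` characters: `\b` matches the empty position at the start and at the end of each word, so every word contributes two matches. The variable names are only for illustration.

```ruby
text = "this, is.\tfour   words"

# Each match of \b is an empty string marking a word start or a word end.
boundaries = text.scan(/\b/)
word_count = boundaries.size / 2

# Matching the words themselves gives the same count and shows what was counted.
words = text.scan(/\w+/)

puts boundaries.size  # => 8
puts word_count       # => 4
p words               # => ["this", "is", "four", "words"]
```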
steenslag
1

I know this is an old question, but this might help someone stumbling here. Counting words is a complicated problem. What is a "word"? Do numbers and special characters count as words? Etc...

I wrote the words_counted gem for this purpose. It's a highly flexible, customizable string analyser. You can ask it to analyse any string for word count, word occurrences, and exclude words/characters using regexp, strings, and arrays.

counter = WordsCounted::Counter.new("Hello World!", exclude: "World")
counter.word_count #=> 1
counter.words      #=> ["Hello"]

Etc...

The documentation and full source are on GitHub.

Mohamad
0

I'd rather check for word boundaries directly:

"Lorem Lorem Lorem".scan(/\w+/).size
=> 3

If you need to match rock-and-roll as one word, you could do:

"Lorem Lorem Lorem rock-and-roll".scan(/[\w-]+/).size
=> 4
leviathan
0

Using a regular expression also handles runs of multiple spaces:

sentence.split(/\s+/).size
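
For comparison, a rough sketch of how a few split variants treat repeated and leading whitespace. The sample string and variable names are made up. Note that a single-space string argument is special-cased by String#split and behaves differently from a single-space regex.

```ruby
sentence = "  two  spaces\tand a tab "

awk_style     = sentence.split(' ')    # ' ' is special: splits on whitespace runs, ignores leading whitespace
no_argument   = sentence.split         # same result when $; is nil (the default)
on_whitespace = sentence.split(/\s+/)  # keeps a leading "" because the string starts with whitespace
on_one_space  = sentence.split(/ /)    # every single space is a separator

puts awk_style.size      # => 5
puts no_argument.size    # => 5
puts on_whitespace.size  # => 6
puts on_one_space.size   # => 7
```

With the regex form, calling strip first avoids the leading empty string.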
Samuel Müller
0

String doesn't have anything pre-built to do what you wanted. You can define a method in your class or extend the String class itself for what you want to do:

def word_count( string )
  return 0 if string.empty?

  string.split.size
end
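
The other option mentioned above, extending String itself, might look like the following minimal sketch. The method name word_count is hypothetical (it is not part of core Ruby), and reopening a core class affects every string in the program, so treat it as an illustration only.

```ruby
# Hypothetical extension: adds String#word_count, which core Ruby does not provide.
class String
  def word_count
    split.size  # split with no argument breaks on runs of whitespace
  end
end

puts "this sentence has five words".word_count  # => 5
puts "".word_count                              # => 0
```

Because "".split returns an empty array, the empty-string guard is not strictly needed in this form.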
Syed Aslam
0

Regex split on any non-word character:

string.split(/\W+/).size

...although it counts a word containing an apostrophe (such as "don't") as two words, so depending on how small the margin of error needs to be, you might want to build your own regular expression.
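
As one possible custom expression, here is a sketch that keeps apostrophes inside words by scanning for runs of word characters and apostrophes. The sample sentence is made up, and the pattern is an assumption about what should count as a word: it would also attach single quotes used as quotation marks.

```ruby
text = "Don't count the dog's bone twice"

split_on_non_word = text.split(/\W+/).size    # "Don't" becomes "Don" and "t"
with_apostrophes  = text.scan(/[\w']+/).size  # apostrophes stay inside the word

puts split_on_non_word  # => 8
puts with_apostrophes   # => 6
```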

Pavling
0

I recently found that String#count is faster than splitting up the string by over an order of magnitude.

Unfortunately, String#count only accepts a string, not a regular expression. Also, it would count two adjacent spaces as two separators rather than one, and you'd have to handle other whitespace characters separately.
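
One way to work around both limitations, sketched under the assumption that words are separated only by spaces, tabs, and line endings: normalise the whitespace first, then count the remaining spaces. The method name count_words_by_spaces is hypothetical, and the intermediate strings it allocates may erode some of the speed advantage.

```ruby
# Hypothetical helper: counts words by counting single spaces after normalising whitespace.
def count_words_by_spaces(text)
  # Trim the ends, turn tabs and line endings into spaces, collapse runs of spaces.
  normalized = text.strip.tr("\t\n\r", " ").squeeze(" ")
  return 0 if normalized.empty?

  normalized.count(" ") + 1
end

puts count_words_by_spaces("  some  word\nother\tword ")  # => 4
puts count_words_by_spaces("   ")                          # => 0
```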

Andrew Grimm
0

p "  some word\nother\tword.word|word".strip.split(/\s+/).size #=> 4
Lri