2

You can use a character class to match any one character from a range, rather than an exact sequence of characters, like this:

> str = "Daniel"
> match = /A-Za-z/.match str
=> nil
> match = /[A-Za-z]/.match str
=> #<MatchData "D">

The first example returned nil because "Daniel" does not contain the literal text "A-Za-z". The second example uses a character class, inside which '-' has the special meaning of a range. So the regex engine scans the string and stops at the first occurrence of a match, which is 'D' in this case.

Since the + quantifier matches one or more occurrences, I can return the full string this way:

> match = /[A-Za-z]+/.match str
=> #<MatchData "Daniel">

match[0] will provide the full string "Daniel" because the regex matched one or more occurrences of any letter, and every character in the string is a letter.
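To make the three cases easy to compare side by side, here is a minimal sketch of the same patterns as a plain Ruby script. The variable names `literal`, `one_char`, `run`, and `literal_hit` are only illustrative; the comments describe what I would expect irb to show.

```ruby
str = "Daniel"

# Without brackets the pattern is the literal text "A-Za-z".
literal = /A-Za-z/.match(str)

# With brackets it is a character class: exactly one character from A-Z or a-z.
one_char = /[A-Za-z]/.match(str)

# The + quantifier repeats the class over one contiguous run of letters.
run = /[A-Za-z]+/.match(str)

# The bracket-less pattern only matches where that literal text appears.
literal_hit = /A-Za-z/.match("xA-Za-zx")

p literal         # nil
p one_char[0]     # "D"
p run[0]          # "Daniel"
p literal_hit[0]  # "A-Za-z"
```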

With that knowledge, I would expect the engine to also be able to match ALL a's in a string. But it doesn't:

> str = "Daaniaal"
> match = /[a]+/.match str
=> #<MatchData "aa">

It seemed to stop after it matched the first two a's, even though I used the + quantifier to match one or MORE occurrences. I would have expected a result like "aaaa". How come this doesn't work?

JohnMerlino

2 Answers

2

Each match is a discrete match - it doesn't glue the results together for you.

To get all results, use str.scan().

> str = "Daaniaal"
> str.scan /a+/
=> ["aa", "aa"]
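If the goal really is a single "aaaa" string, one possible approach (a sketch, not the only way) is to let `scan` collect every run and then join the pieces yourself. The variable names here are just for illustration.

```ruby
str = "Daaniaal"

# String#[] with a regex returns only the first match, like match[0].
first_only = str[/a+/]

# scan returns every non-overlapping match, left to right.
all_runs = str.scan(/a+/)

# Gluing the runs together is a separate, explicit step.
joined = all_runs.join

# Without the quantifier, each "a" is its own match.
singles = str.scan(/a/)

p first_only  # "aa"
p all_runs    # ["aa", "aa"]
p joined      # "aaaa"
p singles     # ["a", "a", "a", "a"]
```

If only the number of a's matters, `str.count("a")` avoids regexes altogether.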
alex
  • match = /[A-Za-z]+/.match str produced a discrete match with every occurrence and it glued the results together. – JohnMerlino Aug 15 '14 at 00:04
  • @JohnMerlino It didn't, it just matched all the characters in the string as one continuous run. In "Daaniaal" there are two distinct substrings that match your pattern. – alex Aug 15 '14 at 00:04
  • So this is how the engine works all the time, it stops after it finds the first match? – JohnMerlino Aug 15 '14 at 00:07
  • @JohnMerlino Correct, when you use `match` that is. Note that you can specify a position in the string to start the search though: http://www.ruby-doc.org/core-2.1.2/Regexp.html#method-i-match – JKillian Aug 15 '14 at 00:09
  • I just wanted to add one more comment. You said that as soon as regex engine finds its first match, it doesn't continue. However, if we have the following string: str = "The moon is made of cheese". And we run this regex on it: match = /\s.+\s/.match str. It returns " moon is made of " instead of "The moon is made of cheese". It's as if the regex engine is aware of the second \s in the pattern even though it should never reach it, since .+ will be true until we hit a newline. – JohnMerlino Aug 15 '14 at 01:03
  • @JohnMerlino Quantifiers are *greedy* by default and will try to match as much as they can, giving characters back only as far as needed for the rest of the pattern to match. To prevent that, you can place a `?` after the `+`, which makes it lazy (non-greedy). It will then stop as soon as the regex's constraints are satisfied. – alex Aug 15 '14 at 02:15
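
The last few comments can be checked with a short sketch. It compares the greedy and lazy forms of the pattern from the comment above, and uses the optional start position of `Regexp#match` that JKillian mentions. The position `3` and the variable names are my own choices for illustration.

```ruby
sentence = "The moon is made of cheese"

# Greedy: .+ first runs to the end of the string, then backs up
# to the last space so that the trailing \s can still match.
greedy = /\s.+\s/.match(sentence)

# Lazy: .+? takes as few characters as possible before the next \s.
lazy = /\s.+?\s/.match(sentence)

p greedy[0]  # " moon is made of "
p lazy[0]    # " moon "

# Regexp#match accepts a start position as its second argument.
str = "Daaniaal"
second = /a+/.match(str, 3)  # start searching at index 3, the "n"

p second[0]        # "aa"
p second.begin(0)  # 5
```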
0

A match has to be contiguous, so a single match covering all four a's would have to match "aaniaa". But your pattern only matches the letter "a", so the match ends at the "n". The second "aa" is a different valid match.

String#scan will give you multiple results.
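
One way to see the contiguity for yourself is to inspect the `MatchData` object. This is a small sketch using standard `MatchData` methods; the second pattern, `/a[a-z]*a/`, is a hypothetical one I made up only to show what it takes to span the "ni" in between.

```ruby
str = "Daaniaal"
m = /a+/.match(str)

p m[0]          # "aa"
p m.begin(0)    # 1  (index where the match starts)
p m.end(0)      # 3  (index just past the match)
p m.pre_match   # "D"
p m.post_match  # "niaal"  (the second run of a's is still unmatched)

# A pattern that also allows other letters in the middle can span the gap.
wide = /a[a-z]*a/.match(str)
p wide[0]       # "aaniaa"
```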

the Tin Man
JKillian