
I am learning about regular expressions in Ruby from a book. I searched online, but I am still confused about the {x} and {x,y} quantifiers.

The book says:

{x}→Match x occurrences of the preceding character.
{x,y}→Match at least x occurrences and at most y occurrences.

Can anyone explain this better, or provide some examples?

Todd A. Jacobs
Dreams
    '.. the preceding *character*' is not entirely true. I would suggest 'the preceding *expression*', because this also works with grouped objects: /(test){3,5}/ and other constructions: /[[:ascii:]]{,10}/. – Jongware Jul 24 '13 at 09:17

4 Answers


Sure, look at these examples:

http://rubular.com/r/sARHv0vf72

http://rubular.com/r/730Zo6rIls

/a{4}/

is the short version for:

/aaaa/

It says: match exactly 4 consecutive 'a' characters,

whereas

/a{2,4}/

says: match at least 2 and at most 4 consecutive 'a' characters.

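It will match the strings

aa
aaa
aaaa

and it won't match

a
xxx

Note that an unanchored /a{2,4}/ still finds a match inside aaaaa (it matches the first four characters). To reject that string as a whole, anchor the pattern: /\Aa{2,4}\z/.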
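
A minimal Ruby sketch of the same idea, for anyone who wants to try it in irb. The helper name two_to_four_a? and the sample strings are made up for illustration; the point is the difference between finding a match somewhere in a string and matching the whole string.

```ruby
# a{4} is shorthand for aaaa.
p "aaaa".match?(/a{4}/)    # => true
p "aaa".match?(/a{4}/)     # => false

# Unanchored, a{2,4} finds a match inside a longer run of 'a'.
p "aaaaa"[/a{2,4}/]        # => "aaaa"
p "a"[/a{2,4}/]            # => nil

# Hypothetical helper: \A and \z make the quantifier cover the whole string.
def two_to_four_a?(str)
  str.match?(/\Aa{2,4}\z/)
end

p %w[a aa aaa aaaa aaaaa xxx].select { |s| two_to_four_a?(s) }
# => ["aa", "aaa", "aaaa"]
```

The quantifier is greedy by default, which is why the unanchored pattern takes four characters out of aaaaa rather than two.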
user1820801
nTraum
  • This answer has been added to the [Stack Overflow Regular Expression FAQ](http://stackoverflow.com/a/22944075/2736496), under "Quantifiers". – aliteralmind Apr 10 '14 at 00:14

"Limiting Repetition" is a good online tutorial for this.

Arup Rakshit

I highly recommend regexbuddy.com. Very briefly, the regex below uses the {x} quantifier you are asking about:

[0-9]{3}|\w{3}

The [ ] brackets define a character class: [0-9] matches any single digit from 0 to 9. The {3} after it means match exactly 3 consecutive digits. The | is alternation (an "or"). The \w is shorthand for any word character (a letter, digit or underscore), and once again the {3} matches only runs of exactly 3.

If you go to RegexPal.com you can enter the code above and test it. I used the following data to test the expression:

909 steve kinzey

and the expression matched '909', 'ste', 'kin' and 'zey'. It did not match the trailing 've' of 'steve' because that is only 2 word characters long, and \w does not match whitespace, so the match could not carry over into the next word.
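
The same test can be reproduced in Ruby with String#scan, which returns every non-overlapping match from left to right. This is only a small sketch; the constant names are made up for illustration, and the sample text is the one used above.

```ruby
# Sample text from the test described above.
SAMPLE_TEXT = "909 steve kinzey"

# Three digits, or three word characters, whichever matches first at each position.
FOUND = SAMPLE_TEXT.scan(/[0-9]{3}|\w{3}/)

p FOUND   # => ["909", "ste", "kin", "zey"]
```

Since \w also matches digits, \w{3} alone would return the same four matches for this input; the [0-9]{3} alternative is shown only to mirror the expression above.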

Steve Kinzey

Interval Expressions

GNU awk refers to these as "interval expressions" in the Regexp Operators section of its manual. It explains the expressions as follows:

{n}
{n,}
{n,m}
One or two numbers inside braces denote an interval expression. If there is one number in the braces, the preceding regexp is repeated n times. If there are two numbers separated by a comma, the preceding regexp is repeated n to m times. If there is one number followed by a comma, then the preceding regexp is repeated at least n times.

The manual also includes these reference examples:

wh{3}y
    Matches ‘whhhy’, but not ‘why’ or ‘whhhhy’.
wh{3,5}y
    Matches ‘whhhy’, ‘whhhhy’, or ‘whhhhhy’, only.
wh{2,}y
    Matches ‘whhy’ or ‘whhhy’, and so on.
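
Although the quotation is from the gawk manual, Ruby accepts the same {n}, {n,m} and {n,} forms, so the examples can be checked there too. The sketch below is an illustration only: the word list and the helper name matching are made up, and the patterns are the three from the manual.

```ruby
# Words with one to six 'h' characters between 'w' and 'y'.
SAMPLES = %w[why whhy whhhy whhhhy whhhhhy whhhhhhy].freeze

# Hypothetical helper: lists which sample words a pattern matches.
def matching(pattern)
  SAMPLES.select { |word| word.match?(pattern) }
end

p matching(/wh{3}y/)    # => ["whhhy"]
p matching(/wh{3,5}y/)  # => ["whhhy", "whhhhy", "whhhhhy"]
p matching(/wh{2,}y/)   # => ["whhy", "whhhy", "whhhhy", "whhhhhy", "whhhhhhy"]
```

No anchors are needed here because the literal w and y on either side of the quantifier already pin down how many h characters may appear between them.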

Todd A. Jacobs