1

I know the basic way to extract a string matching a regex in Scala:

val p = ".*(\\d+ minutes).*".r
val p(m) = "last 12 minutes."

This gives me "12 minutes" in m. But what if I want to match a regex like \\d+ minute(s)?, so that I get "12 minutes" when the input is "last 12 minutes." and "12 minute" when the input is "last 12 minute."?

".*(\\d+ minute(s)?).*".r doesn't work. Perhaps the parentheses mark the part of the match we want to extract, so adding more parentheses to the regex breaks the extraction. I know that ".*(\\d+ minute[s]?).*".r satisfies my requirement, but what if I want to match \\d+ minutes( left)?, in which case I have to use parentheses inside the matched part?

K F

2 Answers

2

You may use a ? quantifier (s?) to make a single character optional, or a non-capturing group with a ? quantifier (here, (?:s)?) to make a sequence of characters optional:

val p = ".*?(\\d+ minutes?).*".r

Or with a non-capturing group:

// Optional "s" written as a non-capturing group
val p = ".*?(\\d+ minute(?:s)?).*".r
// Optional " left" suffix, also non-capturing
val p = ".*\\b(\\d+ minutes(?: left)?).*".r
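To illustrate why the extra parentheses in the question break the one-variable extractor, here is a minimal sketch. The object and method names (GroupArityDemo, withOneBinding, and so on) are hypothetical and only serve the illustration. It assumes the standard behaviour of scala.util.matching.Regex, whose extractor yields one value per capturing group, so the number of bound variables has to equal the number of capturing groups.

```scala
object GroupArityDemo {
  // Hypothetical names, for illustration only.
  val twoGroups = ".*?(\\d+ minutes( left)?).*".r   // two capturing groups
  val oneGroup  = ".*?(\\d+ minutes(?: left)?).*".r // one capturing group

  // Two groups but one variable: the arity differs, so this case never matches.
  def withOneBinding(s: String): Option[String] = s match {
    case twoGroups(m) => Some(m)
    case _            => None
  }

  // Binding (or ignoring) the second group makes the same regex usable.
  def withTwoBindings(s: String): Option[String] = s match {
    case twoGroups(m, _) => Some(m)
    case _               => None
  }

  // A non-capturing group keeps the arity at one.
  def nonCapturing(s: String): Option[String] = s match {
    case oneGroup(m) => Some(m)
    case _           => None
  }

  def main(args: Array[String]): Unit = {
    val input = "last 12 minutes left."
    println(withOneBinding(input))  // None
    println(withTwoBindings(input)) // Some(12 minutes left)
    println(nonCapturing(input))    // Some(12 minutes left)
  }
}
```

When the optional inner group does not take part in the match (for example with "last 12 minutes."), its bound value is null, which is one more reason to prefer the non-capturing form if only the outer text is needed.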

Note that you must make the first .* lazy; otherwise the greedy .* consumes the 1 and only 2 minutes ends up in the capturing group. Alternatively, use a word boundary, ".*\\b(\\d+ minutes?).*".r, or make the regex unanchored and get rid of .* altogether:

val p = """(\d+\s+minutes?)""".r.unanchored
val s = "last 12 minutes."
val res = s match { 
    case p(m) => m
    case _ => ""
}
// => 12 minutes
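The difference between the greedy, lazy, word-boundary, and unanchored variants can be checked side by side. The following is a small sketch, with a hypothetical helper named capture and hypothetical pattern names, run against the sample input from the question.

```scala
import scala.util.matching.Regex

object GreedyVsLazyDemo {
  // Hypothetical names, for illustration only.
  val greedy: Regex     = ".*(\\d+ minutes?).*".r
  val lazyLead: Regex   = ".*?(\\d+ minutes?).*".r
  val boundary: Regex   = ".*\\b(\\d+ minutes?).*".r
  val unanchored: Regex = """(\d+\s+minutes?)""".r.unanchored

  // Returns the single captured group, or None when the pattern does not match.
  def capture(p: Regex, s: String): Option[String] = s match {
    case p(m) => Some(m)
    case _    => None
  }

  def main(args: Array[String]): Unit = {
    val s = "last 12 minutes."
    println(capture(greedy, s))     // Some(2 minutes): the greedy .* swallowed the 1
    println(capture(lazyLead, s))   // Some(12 minutes)
    println(capture(boundary, s))   // Some(12 minutes)
    println(capture(unanchored, s)) // Some(12 minutes)
  }
}
```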

Wiktor Stribiżew
  • Thanks a lot, "non-capturing group", that's what I want. Also thx about the advice for unanchored. – K F Feb 06 '18 at 09:56
1

The question mark ? applies by default to the immediately preceding character, so the following should match both minute and minutes:

val p = ".*(\\d+ minutes?).*".r
Tim Biegeleisen