3

I have a Scala method that currently accepts input and matches it against a regex:

def valiateMap(mapDef : String) : String = {
    val mapRegex = "map<([int,string,date]),([int,string,date])>".r
    mapDef match {
        case mapRegex(keyGroup, valueGroup)   =>    "Yes, we have a match!"
        case _                                =>    "'Fraid not."
    }
}

So as you can see, passing in various values will have the following results:

  • validateMap("map<int,date>") -> "Yes, we have a match!"
  • validateMap("fizzbuzz") -> "'Fraid not."
  • validateMap("map<int,fizz>") -> "'Fraid not." (fizz is not valid for the regex)

Now I need to make this regex recursive, so that the "valueGroup" (2nd group) value can itself be a map, following the same regex as above. And that map could also have a map for its value, which would need to abide by the same regex. So on and so forth.

So I need to modify the regex (and/or the code) to handle input like:

  • validateMap("map<int,map<date,string>>") -> "Yes, we have a match!"
  • validateMap("map<int,map<int,map<int,string>>>") -> "Yes, we have a match!"
  • validateMap("map<int,map<fizz,date>>") -> "'Fraid not."

The kicker is that this needs to be (potentially) infinitely recursive, and handle any number of inner-nested maps.

Any ideas as to how I could accomplish this?

smeeb
  • 27,777
  • 57
  • 250
  • 447
  • 1
    Read RegEx match open tags except XHTML self-contained tags then decide it's a bad idea, and pick one of https://github.com/tpolecat/atto or https://github.com/lihaoyi/fastparse – Reactormonk Oct 06 '16 at 14:54
  • Uhhh, what?!?!? – smeeb Oct 06 '16 at 15:36
  • Right. somehow forgot the link: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags#1732454 – Reactormonk Oct 06 '16 at 17:04

2 Answers2

1

Try this:

def validateMap(mapDef : String) : String = {
  val terminal = "map<(int|string|date),(int|string|date)>".r
  val arbitrary = "map<(int|string|date),(.*)>".r
  mapDef match {
    case terminal(keyGroup, valueGroup)   =>    "Yes, we have a match!"
    case arbitrary(keyGroup, valueGroup)  =>    validateMap(valueGroup)
    case _                                =>    "'Fraid not."
  }
}
Nyavro
  • 8,806
  • 2
  • 26
  • 33
1

Regular expressions are a formalized language for matching certain kinds of patterns in an input string of symbols. One kinds of pattern that is not supported by regular expressions -- as a fundamental definition -- are recursively nested patterns.

The link that @Reactormonk supplied in their comment refers to the similar problem of trying to correctly parse HTML using regular expressions, which fails for the due to the same recursively nested pattern matching.

Since regular recordings are the wrong tool, you can instead consider parser combinators. Those will be more complex to specify but hands-free the more general case. Scala has an API for that built in, or you could use a third party library line fastparse.

hayden.sikh
  • 770
  • 4
  • 7