
I am trying to understand how a for comprehension works, because it is doing something different from what I expect. I have read several answers, the most relevant of which is Scala "<-" for comprehension. However, I am still perplexed.

The following code works as expected. It prints the lines where the values matched by two different regexes are not equal (for context, one regex matches the value in a session cookie and the other matches the value in the GET arguments):

  file.getLines().foreach { line =>
      val whidSession: String = rWhidSession.findAllMatchIn(line) flatMap {m => m.group(1)} mkString ""
      val whidArg: String = rWhidArg.findAllMatchIn(line) flatMap {m => m.group(1)} mkString ""
      if(whidSession != whidArg) println(line)
  }

The following is the problematic code. It iterates over the letters within the matched strings, and so prints the line as many times as there are differing letters in the two values:

  /**
   * This would compare letters, regardless of the use of mkString.. even without the flatMap step.
   */

  val whidTuples = for {
    line <- file.getLines().toList
    whidSession <- rWhidSession.findAllMatchIn(line) flatMap {m => m.group(1) mkString ""}
    whidArg <- rWhidEOL.findAllMatchIn(line) flatMap {m => m.group(1) mkString ""} if whidArg != whidSession 
  } yield line
Community
  • Not sure if this is relevant, but your second version appears (at a glance anyway) to have a difference from the first version: in the `whidSession <- ...` line, the `mkString ""` part is inside the {} instead of outside as it was in the first case. Bug? – Mike Morearty Nov 02 '14 at 17:42
  • In the first snippet you use the regex `rWhidArg`, but in the second you use `rWhidEOL`. Maybe your second regex is matching on every symbol? – Sergii Lagutin Nov 02 '14 at 19:19

1 Answer


To check that corresponding matches are equal:

scala> val ss = "foo/foo" :: "bar/bar" :: "foo/bar" :: Nil
ss: List[String] = List(foo/foo, bar/bar, foo/bar)

scala> val ra = "(.*)/.*".r ; val rb = ".*/(.*)".r
ra: scala.util.matching.Regex = (.*)/.*
rb: scala.util.matching.Regex = .*/(.*)

scala> for (s <- ss; ra(x) = s; rb(y) = s if x != y) yield s
res0: List[String] = List(foo/bar)

But suppose a line may contain multiple matches:

scala> val ss = "foo/foo" :: "bar/bar" :: "baz/baz foo/bar" :: Nil
ss: List[String] = List(foo/foo, bar/bar, baz/baz foo/bar)

With unanchored regexes, the extractor would still compare only the first match of each regex on a line:

scala> val ra = """(\w*)/\w*""".r.unanchored ; val rb = """\w*/(\w*)""".r.unanchored
ra: scala.util.matching.UnanchoredRegex = (\w*)/\w*
rb: scala.util.matching.UnanchoredRegex = \w*/(\w*)

scala> for (s <- ss; ra(x) = s; rb(y) = s if x != y) yield s
res2: List[String] = List()
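
To see why the list comes back empty, here is a minimal sketch of what the unanchored extractors pick up on the last line. The object name `FirstMatchSketch` and the `Option` wrappers are illustrative only and not part of the code above:

```scala
object FirstMatchSketch {
  val ra = """(\w*)/\w*""".r.unanchored
  val rb = """\w*/(\w*)""".r.unanchored
  val s  = "baz/baz foo/bar"

  // An unanchored extractor reports the groups of the first place
  // the pattern matches and ignores any later matches in the string.
  val firstLeft: Option[String]  = s match { case ra(x) => Some(x); case _ => None }
  val firstRight: Option[String] = s match { case rb(y) => Some(y); case _ => None }

  // findAllMatchIn, by contrast, sees every match on the line.
  val allLeft: List[String] = (ra findAllMatchIn s).toList.map(_ group 1)

  def main(args: Array[String]): Unit = {
    println(firstLeft)   // Some(baz)
    println(firstRight)  // Some(baz)
    println(allLeft)     // List(baz, foo)
  }
}
```

Both extractors stop at `baz/baz`, so `x` and `y` are equal and the guard rejects the line even though `foo/bar` appears later in it.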

So compare all matches, pairing every match of ra with every match of rb:

scala> val ra = """(\w*)/\w*""".r ; val rb = """\w*/(\w*)""".r
ra: scala.util.matching.Regex = (\w*)/\w*
rb: scala.util.matching.Regex = \w*/(\w*)

scala> for (s <- ss; ma <- ra findAllMatchIn s; mb <- rb findAllMatchIn s; ra(x) = ma; rb(y) = mb if x != y) yield s
res3: List[String] = List(baz/baz foo/bar, baz/baz foo/bar, baz/baz foo/bar)
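
The line shows up three times because the two generators are nested, so every match of `ra` is paired with every match of `rb`. A small sketch of that pairing, using a hypothetical `CrossProductSketch` object and the same sample line:

```scala
object CrossProductSketch {
  val ra = """(\w*)/\w*""".r
  val rb = """\w*/(\w*)""".r
  val s  = "baz/baz foo/bar"

  // Nested generators: each left-hand group meets each right-hand group.
  val pairs: List[(String, String)] = for {
    ma <- (ra findAllMatchIn s).toList
    mb <- (rb findAllMatchIn s).toList
  } yield (ma group 1, mb group 1)

  // The guard `x != y` keeps every pair whose members differ.
  val differing: List[(String, String)] = pairs.filter { case (x, y) => x != y }

  def main(args: Array[String]): Unit = {
    println(pairs)      // List((baz,baz), (baz,bar), (foo,baz), (foo,bar))
    println(differing)  // List((baz,bar), (foo,baz), (foo,bar))
  }
}
```

Three of the four pairs differ, which accounts for the three copies of the line in `res3`.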

or pair the matches up positionally with zip:

scala> for (s <- ss; (ma, mb) <- (ra findAllMatchIn s) zip (rb findAllMatchIn s); ra(x) = ma; rb(y) = mb if x != y) yield s
res4: List[String] = List(baz/baz foo/bar)

scala> for (s <- ss; (ra(x), rb(y)) <- (ra findAllMatchIn s) zip (rb findAllMatchIn s) if x != y) yield s
res5: List[String] = List(baz/baz foo/bar)

Here the match `ra(x) = ma` should not re-evaluate the regex; it should just do `ma group 1`.
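
As a rough check of that equivalence, the sketch below extracts group 1 both ways and compares the results. It assumes a Scala version in which `Regex` can be used as an extractor on a `Match` (as the transcripts above do), and the object name `MatchExtractorSketch` is made up for illustration. It only shows that the values agree, not how the library computes them:

```scala
import scala.util.matching.Regex

object MatchExtractorSketch {
  val ra: Regex = """(\w*)/\w*""".r
  val s = "baz/baz foo/bar"

  // Pattern matching on each Match with the regex that produced it.
  val viaExtractor: List[String] =
    (ra findAllMatchIn s).toList.map { case ra(x) => x }

  // Reading group 1 from each Match directly.
  val viaGroup: List[String] =
    (ra findAllMatchIn s).toList.map(_ group 1)

  def main(args: Array[String]): Unit = {
    println(viaExtractor)  // List(baz, foo)
    println(viaGroup)      // List(baz, foo)
  }
}
```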

som-snytt