I want to perform a lookup between a Map[String, List[scala.util.matching.Regex]]
and a DataFrame column. If any regex in one of the lists matches a value in
the column, the lookup should return the corresponding key from the map.
The map looks like this:
Map[String, List[scala.util.matching.Regex]] = Map(m1 -> List(rule1, rule2), m2 -> List(rule3), m3 -> List(rule6))
I want to iterate through each list of regexes and match them against the DataFrame column value. Ideally the regex matching would run in parallel rather than sequentially.
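To make the intended lookup concrete, here is a minimal plain-Scala sketch of it, without Spark. The object name `MerchantLookup`, the method `lookup`, and the literal patterns `"rule1"`, `"rule2"`, etc. are placeholders I am assuming for illustration; the real regexes would go in their place.

```scala
import scala.util.matching.Regex

object MerchantLookup {
  // Placeholder rules: the literal patterns stand in for the real regexes.
  val rules: Map[String, List[Regex]] = Map(
    "m1" -> List("rule1".r, "rule2".r),
    "m2" -> List("rule3".r),
    "m3" -> List("rule6".r)
  )

  // Returns the first key whose list contains a regex that matches
  // anywhere in desc, or None if nothing matches (or desc is null).
  def lookup(desc: String): Option[String] =
    Option(desc).flatMap { d =>
      rules.collectFirst {
        case (merchant, patterns) if patterns.exists(_.findFirstIn(d).isDefined) =>
          merchant
      }
    }
}
```

If a value could match regexes under more than one key, this sketch returns whichever key the map iterates first, so the rules are assumed not to overlap.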
Input DataFrame:
+------------------------+
|desc                    |
+------------------------+
|STRING MATCHES SSS rule1|
|STRING MATCHES SSS rule1|
|STRING MATCHES SSS rule1|
|STRING MATCHES SSS rule2|
|STRING MATCHES SSS rule2|
|STRING MATCHES SSS rule3|
|STRING MATCHES SSS rule3|
|STRING MATCHES SSS rule6|
+------------------------+
Expected output:
+-------------------+------------------------+
|merchant           |desc                    |
+-------------------+------------------------+
|m1                 |STRING MATCHES SSS rule1|
|m1                 |STRING MATCHES SSS rule1|
|m1                 |STRING MATCHES SSS rule1|
|m1                 |STRING MATCHES SSS rule2|
|m1                 |STRING MATCHES SSS rule2|
|m2                 |STRING MATCHES SSS rule3|
|m2                 |STRING MATCHES SSS rule3|
|m3                 |STRING MATCHES SSS rule6|
+-------------------+------------------------+
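
One way I can imagine producing this output is to turn the map into a single column expression with `rlike`, `when` and `coalesce`, so that Spark evaluates it across partitions in parallel and no UDF is needed. This is only a sketch under assumptions: the object name `MerchantTagger`, the method `withMerchant`, and the placeholder patterns are hypothetical, and it assumes the regexes use syntax that `rlike` (Java regex) accepts.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{coalesce, col, lit, when}
import scala.util.matching.Regex

object MerchantTagger {
  // Placeholder rules: the literal patterns stand in for the real regexes.
  val rules: Map[String, List[Regex]] = Map(
    "m1" -> List("rule1".r, "rule2".r),
    "m2" -> List("rule3".r),
    "m3" -> List("rule6".r)
  )

  // One `when` per key: yields the key if any of its regexes matches, else null.
  private def merchantColumn(desc: Column): Column = {
    val perKey: Seq[Column] = rules.toSeq.map { case (merchant, patterns) =>
      val anyMatch = patterns.foldLeft(lit(false))((acc, p) => acc || desc.rlike(p.regex))
      when(anyMatch, lit(merchant))
    }
    // coalesce picks the first non-null key; rows with no match stay null.
    coalesce(perKey: _*)
  }

  // Adds a `merchant` column in front of the existing `desc` column.
  def withMerchant(df: DataFrame): DataFrame =
    df.select(merchantColumn(col("desc")).as("merchant"), col("desc"))
}
```

Calling `MerchantTagger.withMerchant(df).show(false)` on the input above should give a table of the shape shown. The parallelism here is across rows and partitions; within a single row the regexes are still checked one after another.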