I am working in a Jupyter Notebook with PySpark v2.3.4, which runs on Java 8, Python 3.6 (with py4j==0.10.7), and Scala 2.11. I have a Scala case class that takes a scala.util.matching.Regex as an argument, like so:
case class myClass(myString: String, myRegex: Regex)
I would like to construct an instance of myClass, but I can't figure out how to construct a scala.util.matching.Regex object in a Python / PySpark environment. Below are the attempts I've made (and docs I've followed) to create a Scala regex, where sc is my SparkContext.
sc._jvm.scala.util.matching.Regex("""(S|s)cala""")
- Error: Constructor scala.util.matching.Regex([class java.lang.String]) does not exist
- This error message dumbfounds me because the Scala 2.11 docs clearly state that the constructor takes a java.lang.String.
sc._jvm.scala.util.matching.Regex("(S|s)cala")
- Same error as above
sc._jvm.scala.util.matching.Regex(r"(S|s)cala")
- Same error as above
sc._jvm.scala.util.matching.Regex("(S|s)cala".r)
(the way they do it in Scala)
- Error: Python string does not have attribute "r"
sc._jvm.java.util.regex.Pattern.compile("(S|s)cala")
successfully creates a Java regex pattern, and the Scala docs clearly state that scala.util.matching.Regex delegates to the java.util.regex package.
Any help/advice would be much appreciated! Thanks in advance!
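A possible explanation for the "does not exist" error, offered as an untested sketch rather than a confirmed fix: in Scala 2.11 the public constructor is declared as Regex(regex: String, groupNames: String*), and Scala varargs compile to a Seq parameter, so at the JVM level the constructor takes two arguments (a String and a Seq[String]). py4j finds constructors by reflection and does not apply Scala's varargs sugar, which would explain why a one-argument call is rejected. The sketch below passes an empty Seq explicitly. The helper name make_scala_regex is hypothetical, and the code assumes that PythonUtils.toSeq (the helper PySpark itself uses to turn lists into Scala sequences) is reachable through sc._jvm and that the gateway auto-converts Python lists.

```python
def make_scala_regex(sc, pattern, group_names=()):
    """Build a scala.util.matching.Regex through py4j (hypothetical helper).

    Assumes the JVM-level constructor is Regex(String, Seq[String]), so the
    Scala varargs parameter has to be supplied as an explicit Scala Seq.
    """
    jvm = sc._jvm
    # PythonUtils.toSeq turns a java.util.List into a Scala Seq; the Python
    # list is assumed to be auto-converted by PySpark's py4j gateway.
    scala_seq = jvm.org.apache.spark.api.python.PythonUtils.toSeq(list(group_names))
    # Two-argument call: the pattern string plus the (possibly empty) group names.
    return jvm.scala.util.matching.Regex(pattern, scala_seq)
```

With a live SparkContext this would be called as make_scala_regex(sc, "(S|s)cala"), and the returned object could then be passed wherever a scala.util.matching.Regex is expected.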