Say my sentence is "I want to remove this string so bad." I loaded the text file as
text = sc.textFile(...)
and I want to filter out (i.e. remove) the word "string". I noticed that Python has a "re" package. I tried doing
RDD.map(lambda x: x.replaceAll("<regular expression>", ""))
to filter out "string", but it seems there is no such function in PySpark, because it gave me an error. How do I import the "re" package? Or is there another function I can use to remove or filter out certain strings based on a regular expression in PySpark?
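For reference, here is a minimal sketch of the kind of thing I am after, assuming the standard-library "re" module can simply be imported at the top of the script and used inside the function passed to map. Python strings have no replaceAll method (that is a Java/Scala method), so re.sub is used instead. The pattern and the helper name remove_word are illustrative assumptions, not part of any PySpark API.

```python
import re

# Hypothetical pattern: the whole word "string" plus any whitespace after it.
# Compiled once so it is not rebuilt for every line.
WORD = re.compile(r"\bstring\b\s*")

def remove_word(line):
    """Return the line with every match of WORD removed."""
    return WORD.sub("", line)

print(remove_word("I want to remove this string so bad."))

# With an RDD such as `text = sc.textFile(...)`, the same function
# could be passed to map (assumption: `sc` is an existing SparkContext):
#     cleaned = text.map(remove_word)
```

Because remove_word is plain Python, it can be checked locally on a few sample lines before running it on the RDD.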