I don't understand how to use pattern matching for two or more regular expressions. For instance, I wrote the following program:
import scala.io.Source.{fromInputStream}
import java.io._
import java.net._

object craw {
  def main(args: Array[String]): Unit = {
    val url = new URL("http://contentexplore.com/iphone-6-amazing-looks/")
    val content = fromInputStream(url.openStream).getLines().mkString("\n")
    // Find every anchor tag, then strip the leading `<a href="http://`
    // (16 characters) and the trailing quote plus the character after it.
    val x = "<a href=(\"[^\"]*\")[^<]".r
      .findAllIn(content)
      .toList
      .map(a => a.substring(16, a.length() - 2))
      .mkString("")
      .split("/")
      .mkString("")
      // split takes a regex, so the dots are escaped to match literally.
      .split("\\.com")
      .mkString("")
      .split("www\\.")
      .mkString("")
      .split("\\.html")
      .toList
    print(x)
  }
}
The above reads in all the anchor tags.
import scala.io.Source.{fromInputStream}
import java.io._
import java.net._

object new1 {
  def main(args: Array[String]): Unit = {
    val url = new URL("http://contentexplore.com/iphone-6-amazing-looks/")
    val content = fromInputStream(url.openStream).getLines().mkString("\n")
    // Find every <p>...</p> element and strip the opening and closing tags.
    val x = "<p>.*?</p>".r
      .findAllIn(content)
      .toList
      .map(p => p.substring(3, p.length() - 4))
      .mkString("")
      // Remove inline markup and semicolons, then split into words.
      .split("</strong>")
      .mkString("")
      .split("</em>")
      .mkString("")
      .split(";")
      .mkString("")
      .split("<em>")
      .mkString("")
      .split("<strong>")
      .mkString("")
      .split(" ")
      .toList
    print(x)
  }
}
The above reads in all the paragraph tags.
I want to combine these two regular expressions into a single program, using pattern matching. Can anyone guide me on how to use two or more regular expressions together?
NOTE: This question is about combining regular expressions, not about how to efficiently parse HTML.
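To make the goal concrete, here is a minimal sketch of the kind of combination meant, not a definitive solution. It assumes a capture group is added to each of the two patterns, joins them with `|` so one pass finds both kinds of tag, and then uses each `Regex` as an extractor in a `match`-style block. The names `CombinedExtract`, `Tag`, `Link`, `Para`, `extract` and the `sample` string are hypothetical and only serve the illustration; the sample replaces the downloaded page so the sketch runs without a network connection.

```scala
import scala.util.matching.Regex

object CombinedExtract {
  // The two patterns from the programs above, each with one capture group added.
  val Anchor: Regex = """<a href="([^"]*)"[^<]""".r
  val Paragraph: Regex = """<p>(.*?)</p>""".r

  // Alternation of both patterns, so a single scan finds either kind of tag.
  val AnchorOrParagraph: Regex = (Anchor.regex + "|" + Paragraph.regex).r

  // Hypothetical result types, used only to tell the two kinds of match apart.
  sealed trait Tag
  case class Link(href: String) extends Tag
  case class Para(text: String) extends Tag

  // Each matched substring is tested against the individual patterns;
  // a Regex extractor only succeeds when the whole string matches it.
  def extract(content: String): List[Tag] =
    AnchorOrParagraph.findAllIn(content).toList.collect {
      case Anchor(href)    => Link(href)
      case Paragraph(text) => Para(text)
    }

  // Made-up sample input standing in for the downloaded page.
  val sample: String =
    "<p>Hello <em>world</em></p>\n" +
      "<a href=\"http://example.com/a.html\">A</a>\n" +
      "<p>Bye</p>"

  def main(args: Array[String]): Unit =
    extract(sample).foreach(println)
}
```

On the sample this yields `Para(Hello <em>world</em>)`, `Link(http://example.com/a.html)` and `Para(Bye)`, in document order. One caveat of this approach: matches do not overlap, so an anchor nested inside a `<p>` element is consumed by the paragraph match and is not reported separately.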