Elements elements = doc.select("span.st");
for (Element e : elements) {
    out.println("<p>Text : " + e.text() + "</p>");
}

Each Element e contains text with an email address in it. How can I extract the email address from it? I have seen the Jsoup API documentation, which provides :matches(regex), but I didn't understand how to use it. I'm trying to use

^[a-zA-Z0-9_!#$%&'*+/=?`{|}~^.-]+@[a-zA-Z0-9.-]+$

which I found while googling.

Thanks in advance for your help.

Alkis Kalogeris
maghub

1 Answer

:matches(regex) is useful if you want to select elements whose text matches a given regex (e.g. find all nodes that contain an email address).

I think this is not what you want. Instead, you need to extract the email from e.text() using regex. In your case:

Elements elements = doc.select("span.st"); 
for (Element e : elements) {        
    out.println("<p>Text : " + e.text()+"</p>");
    out.println(extractEmail(e.text()));
}

// ...
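// Requires java.util.regex.Matcher and java.util.regex.Pattern.
// Returns the first email-like substring in str, or null if there is none.
public static String extractEmail(String str) {
    Matcher m = Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+").matcher(str);
    if (m.find()) {
        return m.group();
    }
    return null;
}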
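
If the text may hold more than one address, the same idea extends to collecting every match. The following is only a sketch using the standard `java.util.regex` classes; the class name `EmailExtractDemo`, the method names, and the sample string are made up for illustration, and the pattern is the simplified one from this answer, not a full address validator. It also shows why a pattern anchored with `^` and `$`, like the one in the question, finds nothing when the address is embedded in longer text: the anchors only let it match when the whole string is an address.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractDemo {
    // Simplified, unanchored pattern: can match anywhere inside a longer string.
    private static final Pattern EMAIL =
            Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+");

    // Anchored variant (illustrative): matches only if the whole input is an address.
    private static final Pattern ANCHORED_EMAIL =
            Pattern.compile("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9.-]+$");

    // Collects every email-like substring in the given text, in order of appearance.
    public static List<String> extractEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // True only when the anchored pattern finds a match in the text.
    public static boolean anchoredFinds(String text) {
        return ANCHORED_EMAIL.matcher(text).find();
    }

    public static void main(String[] args) {
        String sample = "Contact alice@example.com or bob.smith@mail.example.org today";
        System.out.println(extractEmails(sample));
        System.out.println(anchoredFinds(sample));
    }
}
```

In the loop from the question, `extractEmails(e.text())` could stand in for `extractEmail(e.text())` when several addresses per element are expected.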
Luís Soares
  • Thank you @Luís Soares. I'm able to get the email with your solution, but I don't understand the usage of :matches(regex). I tried the Jsoup API docs but I didn't get it. If possible, give me an example. Thank you – maghub Feb 25 '15 at 11:51
  • read here: http://stackoverflow.com/a/23319612/819651 I don't have and haven't found any example but the process would be like: `doc.select("span.st:matches(@)")` would impose a stricter condition in your example; it would only select the nodes containing `@` inside (regardless of the level of nesting). – Luís Soares Feb 25 '15 at 12:02