
I am trying to read a username and password from an email using Java. The mail content is returned as HTML, and I just want to extract the username and password, which sit inside <td> tags. Below is my HTML snippet:

<table width="200">
   <tbody>
     <tr>
        <td colspan="2">Your Account Details:</td>
     </tr>
      <tr>
        <td>EmailId:</td>
        <td><a class="moz-txt-link-abbreviated" href="mailto:jainish.m.kapadia@trimantra.net">jainish.m.kapadia@trimantra.net</a></td>
      </tr>
      <tr>
         <td>Password:</td>
         <td>C3mRXh+|n#1J</td>
      </tr>
  </tbody>
</table>

How do I achieve this?

halfer
NarendraR
  • `(<td>)(.*)(<\\/td>)` will give you the content inside the td tags; you will need to use the 2nd group. You will not get the actual email this way, because that one is inside an <a> tag – XtremeBaumer Dec 13 '16 at 08:26

2 Answers


Please don't try to parse HTML with regex. For a detailed explanation of why you shouldn't, see this SO answer.

You can use jsoup for parsing your HTML Strings like this:

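import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Sample markup; the "content" div and its link are illustrative
String html = "<html><head><title>First parse</title></head>"
  + "<body><div id=\"content\"><a href=\"https://jsoup.org/\">jsoup</a></div></body></html>";
Document doc = Jsoup.parse(html);

// getElementById returns null when no element has the given id
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
  String linkHref = link.attr("href"); // value of the href attribute
  String linkText = link.text();       // visible text of the link
}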

jsoup also offers methods for hierarchical navigation like

siblingElements();
nextElementSibling();

and so on.
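
Applied to the table from the question, this could look roughly like the sketch below. It assumes jsoup is on the classpath and that every detail row holds exactly two <td> cells (a label and a value). The class name AccountDetailsParser and the method parseDetails are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the name and structure are illustrative only.
class AccountDetailsParser {

    // The table from the question, compacted onto fewer lines.
    static final String SAMPLE_HTML =
        "<table width=\"200\"><tbody>"
        + "<tr><td colspan=\"2\">Your Account Details:</td></tr>"
        + "<tr><td>EmailId:</td><td><a class=\"moz-txt-link-abbreviated\" "
        + "href=\"mailto:jainish.m.kapadia@trimantra.net\">"
        + "jainish.m.kapadia@trimantra.net</a></td></tr>"
        + "<tr><td>Password:</td><td>C3mRXh+|n#1J</td></tr>"
        + "</tbody></table>";

    // Maps each label cell (e.g. "EmailId:") to the text of the cell next to it.
    // Assumption: rows that carry a detail have exactly two <td> cells.
    static Map<String, String> parseDetails(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> details = new LinkedHashMap<>();
        for (Element row : doc.select("tr")) {
            Elements cells = row.select("td");
            if (cells.size() == 2) {
                // text() drops nested markup, so the <a> wrapper around the email goes away
                details.put(cells.get(0).text(), cells.get(1).text());
            }
        }
        return details;
    }

    public static void main(String[] args) {
        Map<String, String> details = parseDetails(SAMPLE_HTML);
        System.out.println(details.get("EmailId:"));  // jainish.m.kapadia@trimantra.net
        System.out.println(details.get("Password:")); // C3mRXh+|n#1J
    }
}
```

The heading row has a single cell, so the size check skips it. Keying on the label text keeps the lookup independent of row order.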

Community
mammago
  • Thanks, my problem is resolved. Jsoup is more convenient than using regex to find the matches, even though I did find a regex that works for the password only: `Password:<.*?td>([^<]+)(.+)<.*?td>` – NarendraR Dec 14 '16 at 05:00

You can use the code snippet below:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "your html";
Pattern pattern = Pattern.compile("(<td>(.*?)<\\/td>)");
Matcher matcher = pattern.matcher(str);

This matches every <td> element that has no attributes. You can then loop over the matcher with find() and read group(2) to get the content of each cell.
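
A minimal sketch of that loop, using the same pattern on a compacted copy of the rows from the question, might look like this. The class name TdRegexDemo and the method tdContents are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class TdRegexDemo {

    // Rows from the question's table, compacted onto fewer lines.
    static final String SAMPLE_HTML =
        "<tr><td colspan=\"2\">Your Account Details:</td></tr>"
        + "<tr><td>EmailId:</td><td><a href=\"mailto:jainish.m.kapadia@trimantra.net\">"
        + "jainish.m.kapadia@trimantra.net</a></td></tr>"
        + "<tr><td>Password:</td><td>C3mRXh+|n#1J</td></tr>";

    // Same pattern as in the answer: group 1 is the whole element, group 2 its content.
    static final Pattern TD = Pattern.compile("(<td>(.*?)<\\/td>)");

    // Collects the inner content of every attribute-free <td> element.
    static List<String> tdContents(String html) {
        List<String> contents = new ArrayList<>();
        Matcher matcher = TD.matcher(html);
        while (matcher.find()) {
            contents.add(matcher.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        for (String cell : tdContents(SAMPLE_HTML)) {
            System.out.println(cell);
        }
    }
}
```

On this input the loop finds four cells. The colspan cell is skipped because its opening tag is not a literal <td>, and the email cell comes back as raw <a> markup rather than the bare address, so it would need a second extraction step.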

SachinSarawgi