I want to parse this link:

<a href="http://www.google.fr">Link to google</a>

I need to get two results from it:

Link = "http://www.google.fr"
LinkName = "Link to google"

I really don't know how to do this. Is there a Java library that solves this problem?

Thanks in advance,

Thordax

2 Answers

Use the jsoup HTML parser. For example:

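import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Parse an HTML file; the third argument is the base URI used to resolve relative links.
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");

// Restrict the search to the element with id "content" (null if the page has no such element).
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
    String linkHref = link.attr("href"); // e.g. "http://www.google.fr"
    String linkText = link.text();       // e.g. "Link to google"
}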
Nurlan

This will do for a single link in exactly this format, using only plain string operations.

public class Parse
{
  public static void main(String[] args)
  {
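    String h = " <a href=\"http://www.google.fr\">Link to google</a>";
    // Index of the first double quote, i.e. the start of the href value.
    int n = getIndexOf(h, '"', 0);

    // Splitting on '>' yields ["\"http://www.google.fr\"", "Link to google</a"].
    String[] a = h.substring(n).split(">");
    String url = a[0].replaceAll("\"", "");    // strip the surrounding quotes
    String value = a[1].replaceAll("</a", ""); // strip the closing-tag remainder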

    System.out.println(url + " - " + value);
  }

  // Returns the index of the (n + 1)-th occurrence of c in str, or -1 if there are fewer.
  public static int getIndexOf(String str, char c, int n)
  {
    int pos = str.indexOf(c, 0);
    while (n-- > 0 && pos != -1)
    {
      pos = str.indexOf(c, pos + 1);
    }
    return pos;
  }
}
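
If the markup varies a little (extra attributes before or after href, different letter case), the same string-based idea could be sketched with java.util.regex instead of split. This is only a rough illustration, not a general HTML parser: the class name AnchorRegex and the method firstLink are made up for this example, and it assumes the href value is wrapped in double quotes and that anchors are not nested.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorRegex {
    // Matches one <a ... href="..." ...>text</a>.
    // Group 1 is the URL, group 2 is the link text.
    // Assumption: the href value is enclosed in double quotes.
    private static final Pattern ANCHOR = Pattern.compile(
        "<a\\s[^>]*?href\\s*=\\s*\"([^\"]*)\"[^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns {url, text} for the first anchor found in html, or null if there is none. */
    public static String[] firstLink(String html) {
        Matcher m = ANCHOR.matcher(html);
        return m.find() ? new String[] { m.group(1), m.group(2).trim() } : null;
    }

    public static void main(String[] args) {
        String[] link = firstLink("<a href=\"http://www.google.fr\">Link to google</a>");
        System.out.println(link[0] + " - " + link[1]);
    }
}
```

For anything beyond simple, predictable snippets, a real HTML parser is the safer choice, since regular expressions do not cope well with malformed or nested markup.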
Bitmap