I have HTML content in the form of a String. The string contains many hyperlinks. How can I remove only the first link in the string? Please guide me.

String html = "abcdef<a href=some dynamic url>link1</a>ghijkl<a href=some url>link2</a>mnopq<a href=some url>link3</a>";

I want to remove "link1", along with its reference URL, from the above string.

user1670443
  • Use an [HTML parser](http://nekohtml.sourceforge.net/). – Maroun Oct 21 '13 at 09:05
  • If the problem is the 'first occurrence' part - [replaceFirst](http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replaceFirst(java.lang.String,%20java.lang.String)). But do show us an attempt at solving the problem. – Bernhard Barker Oct 21 '13 at 09:05
  • Read this http://stackoverflow.com/a/1732454/892914 – jnovacho Oct 21 '13 at 09:06

4 Answers

I would do something like

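// Match one <a ...>text</a> element whose text contains no nested tags
String matchATag = "<a[^>]*>([^<]+)</a>";
// replaceFirst removes only the first match and returns a new String
html = html.replaceFirst(matchATag, "");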
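
As a self-contained sketch of the same idea, the snippet below wraps the replacement in a method and runs it on the string from the question. The class name `FirstLinkRemover` and the method `removeFirstLink` are hypothetical names chosen for illustration. The pattern is a slightly looser variant that assumes anchors are not nested and that no attribute value contains a `>` character; it is not a general HTML parser.

```java
import java.util.regex.Pattern;

class FirstLinkRemover {
    // (?i) ignores case, (?s) lets '.' match line breaks.
    // <a\b[^>]*> matches the opening tag, .*? reluctantly matches the link
    // body, so the match ends at the first </a> that follows.
    private static final Pattern ANCHOR =
            Pattern.compile("(?is)<a\\b[^>]*>.*?</a>");

    // Returns the input with its first anchor element removed.
    // Later links, and input without any link, are left unchanged.
    public static String removeFirstLink(String html) {
        return ANCHOR.matcher(html).replaceFirst("");
    }

    public static void main(String[] args) {
        String html = "abcdef<a href=some dynamic url>link1</a>ghijkl"
                + "<a href=some url>link2</a>mnopq<a href=some url>link3</a>";
        System.out.println(removeFirstLink(html));
        // prints: abcdefghijkl<a href=some url>link2</a>mnopq<a href=some url>link3</a>
    }
}
```

The reluctant `.*?` is what keeps the match from running on to the last `</a>` in the string; a greedy `.*` would remove everything between the first opening tag and the final closing tag.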
Dhana Krishnasamy

You can use a regular expression. Example:

html = html.replaceFirst("<a[^>]+>[^>]+</a>", "");
R. Oosterholt

You might try to match the link element with regex, but that's a recipe for problems.

You would be better off using an HTML parser such as NekoHTML: parse the string, find the first link, and remove it.

Bozho

For HTML processing I would suggest jsoup (http://jsoup.org/). The library also lets you specify the replacement behaviour.

mkuff