I'm using Java and I want to find and replace the hyperlink and anchor text of an HTML <a> tag. I know I should use the replace() method, but I'm pretty bad at regex. An example:

<a href="http://example.com">anchor text 1</a>

will be replaced by:

<a href="http://anotherweb.com">anchor text 2</a>

Could you show me the regex for that purpose? Thanks a lot.

Lost Heaven 0809
  • Don't use regex, use an `HTML` [parser](http://stackoverflow.com/questions/2168610/which-html-parser-is-best). – Maroun Oct 14 '13 at 07:57
  • Do you want to replace one specific link and its text, or replace all links and all anchor text with `http://anotherweb.com` and `anchor text 2`? – Jerry Oct 14 '13 at 08:13
  • I want to replace all links and all text in the text. – Lost Heaven 0809 Oct 14 '13 at 08:27

2 Answers

Don't use regex for this task. You should use some HTML parser like Jsoup:

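import org.jsoup.Jsoup;
import org.jsoup.nodes.*;

String str = "<a href='http://example.com'>anchor text 1</a>";

Document doc = Jsoup.parse(str);
// Set the new href and the new anchor text on every <a> that has an href
for (Element a : doc.select("a[href]")) {
    a.attr("href", "http://anotherweb.com").text("anchor text 2");
}
str = doc.body().html();
System.out.println(str);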
Rohit Jain

You could perhaps use replaceAll with the regex:

<a href=\"[^\"]+\">[^<]+</a>

And replace with:

<a href=\"http://anotherweb.com\">anchor text 2</a>

[^\"]+ and [^<]+ are negated character classes; they match one or more characters other than " and <, respectively.
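
For illustration, here is a minimal sketch of how that pattern and replacement might be put together with String.replaceAll. The class name AnchorReplaceSketch and the helper replaceAnchors are made up for this example, and it assumes every link looks exactly like the one in the question: a double-quoted href as the only attribute, and anchor text with no nested tags. Links in any other shape would be left untouched.

```java
class AnchorReplaceSketch {

    // Hypothetical helper: rewrites every simple <a href="...">...</a> in the input.
    static String replaceAnchors(String html) {
        // [^\"]+ matches the old URL, [^<]+ matches the old anchor text.
        String regex = "<a href=\"[^\"]+\">[^<]+</a>";
        // The replacement contains no '$' or '\', so replaceAll uses it literally.
        String replacement = "<a href=\"http://anotherweb.com\">anchor text 2</a>";
        return html.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String input = "<a href=\"http://example.com\">anchor text 1</a>";
        System.out.println(replaceAnchors(input));
    }
}
```

If the new URL or text could ever contain $ or \, wrapping the replacement in java.util.regex.Matcher.quoteReplacement would keep replaceAll from treating those characters specially.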

Jerry