-1

I've got the string <strong>Foo</strong>. I want to remove HTML tags from this string together with their content. In this example, the expression must return "" (an empty string). How should I do this?

Richard Sitze
Tony

2 Answers

2

If the HTML you're trying to remove doesn't contain any nested HTML tags, here's a simple regex-based solution. You can assign the tag name to the variable tag for convenience, and the regex adjusts accordingly.

String tag = "strong";
String str = "This is <strong>Foo</strong>Bar.";

String regex = "<\\s*" + tag + "[^>]*>[^<]*</\\s*" + tag + "\\s*>";

System.out.println(str.replaceAll(regex, "")); // This is Bar.

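The regex accommodates extra tag attributes such as <strong class="bold">, and has been updated to tolerate slightly ill-formatted HTML, such as unnecessary whitespace or newlines inside the tags. It can still break on input it does not anticipate; for example, because the element body is matched with [^<]*, an element that contains a nested tag is left untouched.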
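As a rough sketch of how the attribute and whitespace cases behave, the same regex can be wrapped in a small helper. The class name TagStripper and the method stripTag are hypothetical names used only for illustration, and the Pattern.quote call is an addition that is not part of the answer above; it is there on the assumption that the tag name should be matched literally.

```java
import java.util.regex.Pattern;

class TagStripper {
    // Removes every <tag ...>text</tag> element together with its text content.
    // Assumes the element contains no nested tags (the body is matched by [^<]*).
    static String stripTag(String input, String tag) {
        String name = Pattern.quote(tag); // assumption: treat the tag name literally
        String regex = "<\\s*" + name + "[^>]*>[^<]*</\\s*" + name + "\\s*>";
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        // The example from the question: the whole string is removed.
        System.out.println(stripTag("<strong>Foo</strong>", "strong").isEmpty()); // true
        // Extra attributes on the opening tag are tolerated.
        System.out.println(stripTag("This is <strong class=\"bold\">Foo</strong>Bar.", "strong")); // This is Bar.
        // Stray whitespace inside the tags is tolerated.
        System.out.println(stripTag("A< strong >Foo</ strong >B", "strong")); // AB
        // A nested tag defeats the pattern, so the input comes back unchanged.
        System.out.println(stripTag("<strong>a<em>b</em></strong>", "strong"));
    }
}
```

The last call shows the limitation: with nested markup, an HTML parser is likely a safer choice than a regex.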

Ravi K Thapliyal
0

Since you say you don't have nested tags, you can try using "<([^>]+)>.*?</\\1>":

String data = "bar<strong>foo</strong>yyy<strong>zzz</strong>";
System.out.println(data.replaceAll("<([^>]+)>.*?</\\1>", ""));

Output:

baryyy
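One thing to be aware of: group 1 in the pattern above captures everything between the angle brackets, so an opening tag with attributes (for example <strong class="bold">) will not find a matching closing tag. A possible variant, offered only as a sketch, captures just the tag name and enables DOTALL mode so the body may span lines. The class name AnyTagStripper and the method stripAll are hypothetical, and the modified regex is not from the answer itself.

```java
class AnyTagStripper {
    // (?s) lets . match newlines. Group 1 captures only the tag name, and \b
    // keeps it from being split, so attributes need not reappear in the closing tag.
    // Still assumes no nested tags of the same name.
    static final String REGEX = "(?s)<(\\w+)\\b[^>]*>.*?</\\1\\s*>";

    static String stripAll(String input) {
        return input.replaceAll(REGEX, "");
    }

    public static void main(String[] args) {
        System.out.println(stripAll("bar<strong>foo</strong>yyy<strong>zzz</strong>")); // baryyy
        System.out.println(stripAll("a<b class=\"x\">foo\nbar</b>c")); // ac
    }
}
```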
Pshemo