
I am working on an Android project that parses data from the WordPress API. The post detail content arrives as HTML, so I have to remove the HTML tags. Calling Html.fromHtml().toString() strips all the tags, but the text of some image captions remains, and I need to delete those captions as well. To delete a caption I have to find its tag by class. How can I delete this content using the Html class?

<p class="wp-caption-text">android m marshmallow</

EDIT:

I solved my problem using a regular expression.

Write a regular expression that matches your specific HTML, then use it like this:

 yourHtml = yourHtml.replaceAll("Your_Regular_Expression","");
 yourHtml = Html.fromHtml(yourHtml).toString();
Yeahia2508

1 Answer


To match the whole caption element, you can try this:

<(\w+).*?class="wp-caption-text".*?>[\s\S]*?<\/\1>

Regex101

I'd like to mention that this is not a perfect solution. Regular expressions are not well suited to parsing HTML: the nested structures of the markup are too complex for a regular expression to parse with 100% reliability. See here
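To make that caveat concrete, here is a minimal plain-Java sketch that applies the pattern above with String-level replacement, next to a slightly stricter variant. The class name CaptionStripper, the method names, the sample HTML, and the stricter pattern (which swaps `.*?` inside the opening tag for `[^>]*`) are all assumptions made for illustration; they are not part of the original pattern. On Android you would presumably pass the result on to Html.fromHtml(...).toString() as in the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class CaptionStripper {
    // The pattern from this answer, unchanged.
    private static final Pattern ORIGINAL = Pattern.compile(
            "<(\\w+).*?class=\"wp-caption-text\".*?>[\\s\\S]*?<\\/\\1>");

    // Assumed variant: [^>]* keeps the match inside a single opening tag,
    // so it cannot start at an earlier, unrelated element.
    private static final Pattern STRICT = Pattern.compile(
            "<(\\w+)[^>]*class=\"wp-caption-text\"[^>]*>[\\s\\S]*?</\\1>");

    static String stripOriginal(String html) {
        return ORIGINAL.matcher(html).replaceAll("");
    }

    static String stripStrict(String html) {
        return STRICT.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample input: a caption between two ordinary paragraphs.
        String html = "<p>Intro</p>"
                + "<p class=\"wp-caption-text\">android m marshmallow</p>"
                + "<p>Body</p>";

        // The lazy ".*?" can run across the end of the first <p> tag,
        // so the match starts at "<p>Intro" and removes it as well.
        System.out.println(stripOriginal(html)); // <p>Body</p>

        // The stricter variant removes only the caption element.
        System.out.println(stripStrict(html));   // <p>Intro</p><p>Body</p>
    }
}
```

Even the stricter variant is still only a sketch: it assumes double-quoted attributes with the class value written exactly as shown, and it will cut the match short if the caption contains a nested element with the same tag name.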

d0nut