I use a rich text editor in my Android app, which works by converting rich text to HTML.
Now I want to build an abstract from that HTML, containing the plain text and some of the images, so I decided to extract the plain text and images on the server side with PHP. At first I tried to do it with regex, but the patterns get very complex, and it seems too hard for an embedded engineer like me.
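One alternative to regex I'm considering is PHP's built-in DOMDocument parser. Here is a minimal sketch of what I have in mind, not a finished solution. The function name `extractAbstract`, the `$maxChars` parameter, and the returned array shape are just names I made up for illustration, and it assumes the `dom` and `mbstring` extensions are enabled and the HTML is UTF-8:

```php
<?php
// Hypothetical helper: name, parameters, and return shape are illustrative assumptions.
function extractAbstract(string $html, int $maxChars = 200): array
{
    $doc = new DOMDocument();

    // Editor-generated markup is often not strictly valid; collect parser
    // warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The XML encoding prefix hints the parser to treat the input as UTF-8.
    // Wrapping in a <div> gives the fragment a single root element.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collect the src attribute of every <img> element, in document order.
    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $images[] = $src;
        }
    }

    // textContent concatenates all text nodes and drops the tags.
    // Note: adjacent block elements are joined without a separator.
    $body = $doc->getElementsByTagName('body')->item(0);
    $text = $body !== null ? $body->textContent : '';

    // Collapse whitespace runs into single spaces.
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? '');

    return [
        'text'   => mb_substr($text, 0, $maxChars), // character-safe truncation
        'images' => $images,
    ];
}

// Example call with a small fragment like the editor might produce.
$abstract = extractAbstract('<p>Hello <b>world</b></p><img src="a.png">', 50);
print_r($abstract);
```

I'm not sure whether this is the right approach, or how well it copes with the markup my editor actually produces.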
Could anyone give me some suggestions?