I suggest that you move away from using regular expressions to parse (or manipulate) HTML, because it's not a good idea, and here's a great SO answer on why.
For example, by using Peter's approach:
preg_match_all('~<img src="(.+?)" width="(.+?)">~is', $content, $return);
you are assuming that all your images start with <img, are followed by the src, and then contain the width=, all typed exactly like that, with those exact whitespace separations and those particular quotes. That means that you will not capture any of these perfectly valid HTML images that you want to remove:
<img src='asd' width="123">
<img  src="asd" width="123">
<img src="asd" class='abc' width="123">
<img src="asd" width = "123">
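As a quick sanity check, here is a small sketch that runs Peter's pattern against a few variants like the ones above. The helper name matchesPetersPattern and the sample strings (including the photo.jpg baseline) are my own, purely for illustration:

```php
<?php
// Hypothetical helper: true if Peter's pattern finds an image in $html.
function matchesPetersPattern(string $html): bool
{
    $pattern = '~<img src="(.+?)" width="(.+?)">~is';
    return preg_match($pattern, $html) === 1;
}

// Illustrative samples: one tag in the exact shape the pattern expects,
// followed by variants that differ only in quoting, spacing or attributes.
$samples = [
    '<img src="photo.jpg" width="200">',          // exact shape
    "<img src='asd' width=\"123\">",              // single quotes around src
    '<img  src="asd" width="123">',               // two spaces after img
    '<img src="asd" class=\'abc\' width="123">',  // extra attribute in between
    '<img src="asd" width = "123">',              // spaces around the equals sign
];

foreach ($samples as $html) {
    echo (matchesPetersPattern($html) ? 'match   ' : 'no match'), '  ', $html, PHP_EOL;
}
```

Only the first sample should be reported as a match; every other variant slips through.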
While it's of course perfectly possible to catch all these cases, do you really want to go through all that effort? Why reinvent the wheel when you can just parse the HTML with existing tools? Take a look at this other question.
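To give an idea of what that can look like, here is a minimal sketch using PHP's built-in DOM extension. I'm assuming the goal is to remove every <img> element that has a width attribute; the function name removeImagesWithWidth is made up for this example, and the sample input is plain ASCII (loadHTML needs extra care with other encodings):

```php
<?php
// Hypothetical helper: strips every <img> that has a width attribute,
// regardless of quoting style, spacing or attribute order.
function removeImagesWithWidth(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely perfect: collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select the images by structure instead of by how they were typed.
    $xpath = new DOMXPath($doc);
    $images = iterator_to_array($xpath->query('//img[@width]'));
    foreach ($images as $img) {
        $img->parentNode->removeChild($img);
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$content = '<p>Hello</p><img src=\'a.png\' width = "10"><img src="b.png">';
echo removeImagesWithWidth($content), PHP_EOL;
```

The XPath expression //img[@width] does the job the regex was trying to do, except that the parser has already dealt with quotes, whitespace and attribute order for you. Note that the serialized output may normalize the markup (for example the quoting of attributes), so it won't necessarily be byte-for-byte identical to the input.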