First off: markup (HTML) and regex don't mix well. Be that as it may, you can remove spaces in between tags with the following regex quite easily:
$clean = preg_replace('/>\s+</', '><', $string);
This will remove whitespace (spaces, tabs and line breaks) found in between tags, provided there's nothing else in between:
<p>Foobar <b>is</b> not a word <i>as such</i> <p>
will be "translated" into:
<p>Foobar <b>is</b> not a word <i>as such</i><p>
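To see what the pattern touches and what it leaves alone, here is a minimal sketch. The function name collapseInterTagWhitespace and the sample strings are made up for illustration; only the regex comes from the answer above.

```php
<?php
// Hypothetical helper name; it just wraps the one-liner shown above.
function collapseInterTagWhitespace(string $html): string
{
    // Matches only whitespace sitting directly between '>' and '<'.
    // preg_replace() returns null on failure, so fall back to the input.
    return preg_replace('/>\s+</', '><', $html) ?? $html;
}

$samples = [
    "<tr>\n    <td>cell</td>\n</tr>",       // indentation between tags
    '<b>bold</b> <i>italic</i>',            // a space that matters visually
    '<p>text stays <b>as is</b> here</p>',  // whitespace next to text
];

foreach ($samples as $sample) {
    echo collapseInterTagWhitespace($sample), PHP_EOL;
}
// <tr><td>cell</td></tr>
// <b>bold</b><i>italic</i>
// <p>text stays <b>as is</b> here</p>
```

Note the second sample: the space between two inline elements is removed as well, so the rendered words end up glued together. That is one of the reasons regex and markup don't mix well.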
That's fine, but still, it'd be better (and safer) to parse, sanitize and then echo the markup using the DOMDocument class. But before you start hacking away and write thousands of lines of code to ensure you're processing valid markup, ask yourself this simple question:
How can I make sure that the markup I'm processing is well-formed, and valid to begin with?
Instead of writing code that works around bad markup, look into ways of making sure the data you're processing is of good quality to begin with.
Anyway, here's a simple example of how to use the DOMDocument class:
$dom = new DOMDocument;
$dom->loadHTML($string);
echo $dom->saveHTML(); // echoes the markup as normalized by the parser
This assumes the $string is a full document (including <html>, doctype and all other tags that implies). If you don't have such a string, you'll have to use saveXML on the node you're after:
echo $dom->saveXML($dom->getElementsByTagName('body')->item(0)); // saveXML is a DOMDocument method; pass it the node to serialize
Where body is the root node of your markup. See the docs for examples and details.
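Putting those pieces together for a fragment rather than a full document could look roughly like the sketch below. The helper name normalizeFragment and the html/body wrapper are my own assumptions, not anything DOMDocument prescribes. It passes each child node to saveHTML, which accepts a node argument as of PHP 5.3.6. It also assumes the fragment contains no <html> or <body> tags of its own and is plain ASCII or otherwise in an encoding the parser guesses correctly.

```php
<?php
// Hypothetical helper: parse an HTML fragment and return the markup
// as the parser repaired it, without the surrounding <html>/<body>.
function normalizeFragment(string $fragment): string
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them
    // for every bit of sloppy markup, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<html><body>' . $fragment . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize the children of <body>, leaving out the wrapper itself.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

// An unclosed <p>: the parser closes it when building the tree.
echo normalizeFragment('<p>Foobar <b>is</b> not a word');
```

This only repairs structure. It doesn't strip scripts or attributes, so don't mistake it for a security filter.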
If the string you have is what you've included in your question, then all spaces need to be removed. In that case, regex is just not necessary:
$string = '<tr>
<td>';
echo str_replace(' ', '', $string); // removes every space character (line breaks are left alone)
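If the line breaks and tabs have to go too, str_replace also takes an array of search strings, so there's still no need for regex. A small sketch, assuming the input looks like the two-line snippet above (the indentation in the sample is my guess):

```php
<?php
$string = "<tr>\n    <td>";

// Strip spaces, tabs and both line-break characters in one call.
$compact = str_replace([' ', "\t", "\n", "\r"], '', $string);

echo $compact; // <tr><td>
```

Bear in mind this is a blunt instrument: it also removes the spaces inside text and between attributes, so it only makes sense for strings that contain nothing but bare tags.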
Ah well, browse through the documentation of the DOMDocument class, it's worth the effort. Honest :)