**>> Please see the update near the bottom.**
I am having to deal with a large amount of imported HTML code that is poorly formatted.
I have around 200 similar (but not identical) instances of the code, and each instance includes a specific set of <img> tags. In some instances, the <img> tags run from one to the next, with no line breaks in between. In other instances there are line breaks in the code, and these result in <br> tags being inserted into the final code sent to the browser.
This will make more sense once I illustrate what I mean:
Example #1: There are no breaks between the <img> tags...
<table align="center" border="0px"> <tbody><tr> <td> <img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/CustomerSatisfaction.png" alt="100% Customer Satisfaction" height="60" align="middle" width="140"> <img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/PaypalVerified.png" alt="Paypal Verified" height="60" align="middle" width="140"> <img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/FastDelivery.png" alt="Fast Delivery" height="60" align="middle" width="140"> <img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/Recycled.png" alt="100% Recyled Pre-owned Products" height="60" align="middle" width="140"> <img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/TopSellerRated.png" alt="Top Seller Rated" height="60" align="middle" width="140"> <img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/PhoneSupport.png" alt="Phone Support" height="60" align="middle" width="140"> </td> </tr> </tbody></table>
Example #2: There are breaks between the <img> tags...
<table align="center" border="0px">
<tbody><tr>
<td>
<img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/CustomerSatisfaction.png" alt="100% Customer Satisfaction" align="middle" height="60" width="140">
<img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/PaypalVerified.png" alt="Paypal Verified" align="middle" height="60" width="140">
<img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/FastDelivery.png" alt="Fast Delivery" align="middle" height="60" width="140">
<img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/Recycled.png" alt="100% Recyled Pre-owned Products" align="middle" height="60" width="140">
<img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/TopSellerRated.png" alt="Top Seller Rated" align="middle" height="60" width="140">
<img src="http://simplicitywebsitedesign.com/iOutlet/images/buttons/PhoneSupport.png" alt="Phone Support" align="middle" height="60" width="140">
</td>
</tr>
</tbody></table>
As mentioned, for reasons unknown to me, the WordPress site on which this code is used throws in <br> tags when code example #2 is passed through to the browser.
That results in the images displaying as follows (on Firefox):
Code sample #1 displays like this:
I am thinking the best way to resolve this is to do a search/replace via MySQL on the DB, using a regular expression that will identify instances of code example #2 and make them like code example #1. In other words, the line breaks will be removed from between the relevant <img> tags.
Two questions:
1) Is that in fact the best way to go about this, or is there a potentially better way?
2) If that is a valid and suitable way to do it, could you suggest a suitable regular expression?
(With question 2, I am not sure which regex engine to specify. The regex will be run within MySQL, using the Mac app Sequel Pro.app (http://www.sequelpro.com/).)
My take on the possible Regex logic
My guess is that we need to:
1) Find the instance of <table ...> ... </table>
2) Find instances of <img ...> (soft line break) <img ...> within the code identified by #1 above
3) Remove the (soft line break)
There is one other <table> ... </table> set within the code that will be searched. There is only one <img> within that instance. There are exactly 6 <img> instances within the <table> ... </table> that I want to change.
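For illustration, the three steps above could be sketched in Python, run against an exported copy of the content rather than inside MySQL (MySQL versions before 8.0 can match with REGEXP but have no regex replace function). This is only a sketch under assumptions: the function name `join_img_tags` is hypothetical, and it assumes the tables are not nested.

```python
import re

# Step 1: one whole <table ...> ... </table> block (non-greedy, spans newlines).
# Assumption: tables are not nested inside one another.
TABLE_RE = re.compile(r"<table\b[^>]*>.*?</table>", re.DOTALL | re.IGNORECASE)

# Step 2: an <img ...> tag, then whitespace containing a line break,
# then (lookahead only) the start of the next <img ...> tag.
IMG_GAP_RE = re.compile(r"(<img\b[^>]*>)\s*\n\s*(?=<img\b)", re.IGNORECASE)


def join_img_tags(html: str) -> str:
    """Replace line breaks between adjacent <img> tags with a single space,
    but only inside <table> blocks. (Hypothetical helper name.)"""

    def fix_table(match: re.Match) -> str:
        # Step 3: drop the line break, keep the first tag, leave one space.
        return IMG_GAP_RE.sub(r"\1 ", match.group(0))

    return TABLE_RE.sub(fix_table, html)
```

A table containing only one <img> has no <img>-to-<img> gap, so the other table would pass through unchanged.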
Update, taking comments into account
It has been suggested that I use the flex value of the CSS display property, and apply it to the table row. I've done that, and it works well. I am a little concerned about compatibility with older browsers, as I gather it's a relatively recent CSS addition.
I do, however, still need to do a search/replace to locate the correct <table> in the HTML.
In most of the HTML instances, there are two instances of <table> ... </table>. So I suspect the regex would need to do a negative lookahead for something like /stars/, which exists in a URL that's in the <table> instance I don't want modified. Then it would be a matter of replacing <table> with <table id="green-icons">.
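As a rough sketch of that negative lookahead idea, again in Python on exported content rather than in MySQL: the pattern below only matches an opening <table> tag if /stars/ does not appear before the next </table>. The function name `tag_icon_table` is hypothetical, and the sketch assumes tables are not nested and that the target table has no id attribute yet.

```python
import re

# Match an opening <table ...> tag that
#   - has no id attribute yet (so a second run changes nothing), and
#   - is followed by its closing </table> with no "/stars/" in between.
# Assumption: tables are not nested.
TARGET_TABLE_RE = re.compile(
    r"<table\b(?![^>]*\sid\s*=)([^>]*)>"
    r"(?=(?:(?!</table>|/stars/).)*</table>)",
    re.DOTALL | re.IGNORECASE,
)


def tag_icon_table(html: str) -> str:
    """Add id="green-icons" to every table that does not contain /stars/.
    (Hypothetical helper name; existing attributes are preserved.)"""
    return TARGET_TABLE_RE.sub(r'<table id="green-icons"\1>', html)
```

Existing attributes such as align and border are kept after the new id, so the markup from the examples would become <table id="green-icons" align="center" border="0px">.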
Thanks.
Jonathan
P.S. I am aware there is a LOT of contention around whether or not regex is a valid way to make changes to HTML. As this is a relatively fixed and known set of HTML, I suspect it'll be okay. But I am also open to other suggestions.