Is there a way, using regex or other PHP functions, to extract all HTML text into a PHP array?
For example, I have this piece of code:
Example 1:
<div class="user" ><?= $username ?></div>
<table>
<tr>
<td>Cell 1</td>
<td>Cell 2</td>
</tr>
</table>
<span>Lorem ipsum <b>dolor</b> sit amet</span>
Lorem ipsum dolor sit amet <a href="www.example.com">Lorem</a>
Dolor site amet at date <?php echo date('Y-m-d'); ?> example
I need some way to feed it into a form that will output an array like this:
Array(
[0] => "Cell 1"
[1] => "Cell 2"
[2] => "Lorem ipsum <b>dolor</b> sit amet"
[3] => "Lorem ipsum dolor sit amet "
[4] => "Lorem"
[5] => "Dolor site amet at date "
[6] => " example"
)
But it should make exceptions for text-decoration tags like <u>, <b>, and <i>, which should stay inside the extracted string (as in item [2] above).
I tried using strip_tags with the mentioned exceptions, but it is inconsistent and often returns only the first string, ignoring the rest.
UPDATE
This regex (?<=>)\s*(?=<)|(?<=>)\n*([^<]+) is almost what I asked for; only a few cases still slip through.
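For context, this is roughly how the pattern can be applied. It is only a minimal sketch: the `~` delimiter, the `$html` sample, and the filtering step are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical sample input: just the table from Example 1.
$html = "<table>\n<tr>\n<td>Cell 1</td>\n<td>Cell 2</td>\n</tr>\n</table>";

// The regex from the update. '~' is used as the delimiter so that '/'
// inside closing tags would not need escaping.
$pattern = '~(?<=>)\s*(?=<)|(?<=>)\n*([^<]+)~';

preg_match_all($pattern, $html, $matches);

// Group 1 holds the text nodes. Whitespace-only gaps between tags are
// matched by the first alternative and leave empty strings in group 1,
// so those are filtered out here.
$texts = array_values(array_filter($matches[1], function ($s) {
    return trim($s) !== '';
}));

print_r($texts);
```

With this input, `$texts` contains the two cell strings, "Cell 1" and "Cell 2".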
When it finds script tags, it returns what is between them:
<script type="text/javascript">
tipoProd = 'Squares';
</script>
Returns:
tipoProd = 'Squares';
And when it finds the line below:
<div class="content section" style="padding: 40px 0px; display: <?= $dev?'none':'block'?>; text-align:center" id="selectOptions">
It returns everything after the PHP closing tag:
; text-align:center" id="selectOptions">
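Both cases can be reproduced in isolation. The sketch below assumes the pattern is run through `preg_match_all`; the variable names are made up for illustration. As far as I can tell, the cause is that the lookbehind `(?<=>)` only checks that the previous character is `>`. It cannot tell the end of an HTML tag from the end of a PHP closing tag, and it has no notion of being inside a script element.

```php
<?php
$pattern = '~(?<=>)\s*(?=<)|(?<=>)\n*([^<]+)~';

// Case 1: a PHP short echo tag inside an attribute value.
// Single quotes keep PHP from interpolating $dev in this test string.
$attributeCase = '<div class="content section" style="padding: 40px 0px; display: <?= $dev?\'none\':\'block\'?>; text-align:center" id="selectOptions">';
preg_match_all($pattern, $attributeCase, $m);
$fromAttribute = $m[1];

// Case 2: inline JavaScript between script tags.
$scriptCase = "<script type=\"text/javascript\">\ntipoProd = 'Squares';\n</script>";
preg_match_all($pattern, $scriptCase, $m);
$fromScript = $m[1];

print_r($fromAttribute); // the tail of the attribute list after the PHP tag
print_r($fromScript);    // the JavaScript statement, with a trailing newline
```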
How can I add these exceptions to the regex?