What I'm trying to do is take a block of HTML, strip out all the HTML tags, and put each line of text into a PHP array.
I'm just trying it with one block to test (hence the WHERE ID = '2409' in my MySQL query).
The HTML portion for ID 2409 looks like this:
<table class="description-table">
<tbody>
<tr><td>Saepe Encomia 2.aD NEC Mirum Populo Soluni Iis 8679-1370 Status Error Sed 9.9</td></tr>
<tr><td>Description</td></tr>
<tr><td></td>
<td><br>
<br><p></p><p></p>
<strong><br></strong> <strong><br></strong> <strong>Donec Rem </strong><br>
<br>
<strong>Animam Urgebat<br>
<br></strong> <strong><br>
<br>
Rerum Sed 8613 - 3669 8358 & 6699<br>
<br>
1.mE (magNA) QUO Ad Nominum Statum Massa<br>
ab SEM Autem Reddet Habitu Sit<br>
<br></strong> <strong> PRAEDAM ACCUMSAN PERSONARUM DENEGARE AC DUORUM</strong> <strong><br></strong> <strong><br></strong> <strong>Lius typi sit nec quo adversis cras ministri oppressa, versus class hic rem quos colubros ullo commune!economy!</strong><strong><br></strong><strong> ad Quisque Modeste</strong><strong> ac Rem Wisi</strong><strong> ex Hac Congue mus Leo</strong><strong> ab 7/92" Alias</strong><strong> ad 2/73" Adverso & Erat</strong><strong> me Personom Eget</strong><strong> ad Viribus Fuga Fuga</strong><strong> ab Louor-Sit Molles</strong><strong class="c2"> 3x Block-Off Plates</strong><strong class="c2"> ad Facunda</strong><strong class="c2"> ab Personas Diam<br>
NUNC<br>
ex Teniet te Palmam Eaque<br>
me Teniet in Versus Urna<br></strong> <strong><br></strong><br>
<strong class="c3">**CONDEMNENDUS REM CUM MAGNORUM**</strong><strong></strong><br>
</td>
</table>
And here's my PHP script designed to parse it:
//connect to mysqli
$results = $mysqli->query("SELECT ID, post_content
FROM wp_posts
WHERE ID = '2409';");
while($row = $results->fetch_array()) {
$htmlarray2 = preg_split('/<.+?>/', $row['post_content']);
$htmlarray = array_values(array_filter(array_map('trim', $htmlarray2)));
echo '<pre>';
print_r($htmlarray);
echo '</pre>';
. . .
}
This produces output like this:
Array
(
[0] => Saepe Encomia 2.aD NEC Mirum Populo Soluni Iis 8679-1370 Status Error Sed 9.9
[1] => Donec Rem
[2] => Animam Urgebat
[3] => Rerum Sed 8613 - 3669 8358 & 6699
[4] => 1.mE (magNA) QUO Ad Nominum Statum Massa
[5] => ab SEM Autem Reddet Habitu Sit
[6] => PRAEDAM ACCUMSAN PERSONARUM DENEGARE AC DUORUM
[7] => Lius typi sit nec quo adversis cras ministri oppressa, versus class hic rem quos colubros ullo commune!
[8] => ad Quisque Modeste
[9] => ac Rem Wisi
[10] => ex Hac Congue mus Leo
[11] => ab 7/92" Alias
[12] => ad 2/73" Adverso & Erat
[13] => me Personom Eget
[14] => ad Viribus Fuga Fuga
[15] => ea Totam Poenam
[16] => ab Louor-Sit Molles
[17] => ad Facunda
[18] => ab Personas Diam
[19] => NUNC
[20] => ex Teniet te Palmam Eaque
[21] => me Teniet in Versus Urna
[22] => **CONDEMNENDUS REM CUM MAGNORUM**
)
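To make the intermediate steps visible, here is a minimal sketch of the same split, trim, and filter pipeline run on a shortened fragment. The `$html` string is a made-up stand-in for the `post_content` column, and `$pieces` and `$lines` are illustrative names, not variables from my script.

```php
<?php
// Hypothetical, shortened sample of the post_content column.
$html = '<tr><td>Saepe Encomia</td></tr><strong> ad Facunda</strong>'
      . '<strong> ab Personas Diam<br>' . "\n" . 'NUNC<br></strong>';

// Split on every tag. The text between tags becomes the array elements,
// including empty strings wherever two tags sit next to each other.
$pieces = preg_split('/<.+?>/', $html);

// Trim each piece, drop the empty ones, and re-index from 0.
$lines = array_values(array_filter(array_map('trim', $pieces)));

print_r($lines);
// Array
// (
//     [0] => Saepe Encomia
//     [1] => ad Facunda
//     [2] => ab Personas Diam
//     [3] => NUNC
// )
```

On plain ASCII input like this the leading spaces disappear as expected, which suggests the pipeline itself is sound and the leftover padding in the real data is some other character.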
The array itself is okay, but now I'm having an issue removing the whitespace before and after the strings in the array.
Let's take element 8 of the array as an example:
. . .
$arrayvalue = $htmlarray2['8'];
which echoes like this:
ad Quisque Modeste
Now, what I'm trying to do is obviously trim each element of the array, but for testing I'm just working with this one variable, $arrayvalue.
My issue is that trim() isn't working on this variable fetched from MySQL: adding trim($arrayvalue); has no effect, and it echoes out the same way as above.
I know this has something to do with fetching the data via my query, because if I test the same string on its own in a separate PHP script
$string = ' ad Quisque Modeste ';
echo trim($string);
it works fine, and echo outputs simply ad Quisque Modeste with no whitespace before or after the string.
Why isn't trim() working in my while loop?
What's the trick to trimming the leading and trailing whitespace from the elements?
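One possible explanation, sketched below, is that the padding is not an ASCII space at all. The example assumes the stray character is a UTF-8 non-breaking space (bytes 0xC2 0xA0, typically what a decoded `&nbsp;` becomes); `$fromDb` is a hypothetical stand-in for one array element, not the real database value.

```php
<?php
// Assumption: the "space" around the text is a UTF-8 non-breaking space
// (bytes 0xC2 0xA0), not an ASCII space.
$nbsp   = "\xC2\xA0";
$fromDb = $nbsp . 'ad Quisque Modeste' . $nbsp;

// Without a second argument, trim() strips only " ", \t, \n, \r, \0 and \x0B.
$plain = trim($fromDb);
var_dump($plain === $fromDb);              // bool(true): nothing was removed

// A UTF-8 aware regex trims Unicode whitespace as well.
$clean = preg_replace('/^\s+|\s+$/u', '', $fromDb);
var_dump($clean);                          // string(18) "ad Quisque Modeste"

// bin2hex() makes the invisible bytes visible when debugging.
echo bin2hex(substr($fromDb, 0, 4)), "\n"; // c2a06164
```

Running `bin2hex()` on the real value from the query would confirm or rule out this assumption.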
Edit: Here's my full while loop, as requested. It's a bit different than the example above (I've been making a lot of modifications trying to solve this myself, so it keeps changing), but here is what I have right now in full:
while($row = $results->fetch_array()) {
$id = $row['ID'];
echo 'ID: ' . $id;
echo '<br />';
//replace with white space
$converted = strtr($row['post_content'],array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES)));
trim($converted, chr(0xC2).chr(0xA0));
//remove html elements
$htmlarray = preg_split('/<.+?>/', $converted);
// remove empty array elements and re-index array
$htmlarray2 = array_values(array_filter(array_map('trim', $htmlarray)));
// test by getting single value from array
$arrayvalue = $htmlarray2['9'];
// my attempt to trim string in while loop
trim($arrayvalue);
// doesn't trim
echo '<hr>' . $arrayvalue . '<hr>';
// put this here so I can see the full array
echo '<pre>';
print_r($htmlarray2);
echo '</pre>';
}
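A separate point about the loop above: trim() returns the trimmed copy and never modifies its argument, so a bare `trim($x);` statement has no visible effect. The sketch below uses a hypothetical `$arrayvalue` with ordinary ASCII spaces to show the difference, and then passes a character list that also includes the two non-breaking-space bytes, assuming that is what the real data contains.

```php
<?php
// Hypothetical stand-in value padded with ordinary ASCII spaces.
$arrayvalue = '  ad Quisque Modeste  ';

trim($arrayvalue);                  // the returned string is thrown away
echo '[' . $arrayvalue . ']', "\n"; // [  ad Quisque Modeste  ]

$arrayvalue = trim($arrayvalue);    // keep the returned copy
echo '[' . $arrayvalue . ']', "\n"; // [ad Quisque Modeste]

// The second argument of trim() is a list of single bytes to strip, so the
// two bytes of a UTF-8 non-breaking space can be added to the defaults.
$withNbsp = "\xC2\xA0ad Quisque Modeste\xC2\xA0";
$trimmed  = trim($withNbsp, " \t\n\r\0\x0B\xC2\xA0");
echo '[' . $trimmed . ']', "\n";    // [ad Quisque Modeste]
```

Because the character list works byte by byte, it can also clip part of another multibyte character that happens to start or end with 0xC2 or 0xA0, so a UTF-8 aware regex is the safer choice for arbitrary text.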
As requested, here are the results of var_export($row['post_content']);
'<table class="product-description-table">
<tbody>
<tr>
<td class="item" colspan="3">Saepe Encomia 2.aD NEC Mirum Populo Soluni Iis 8679-1370 Status Error Sed 9.9</td>
</tr>
<tr>
<td class="title" colspan="3"></td>
</tr>
<tr>
<td class="content"><br>
<br>
<p class="c1"></p>
<p class="c1"></p>
<strong><br></strong> <strong><br></strong> <strong>Donec Rem </strong><br>
<br>
<strong>Animam Urgebat<br>
<br></strong> <strong><br>
<br>
Rerum Sed 8613 - 3669 8358 & 6699<br>
<br>
1.mE (magNA) QUO Ad Nominum Statum Massa<br>
ab SEM Autem Reddet Habitu Sit<br>
<br></strong> <strong> PRAEDAM ACCUMSAN PERSONARUM DENEGARE AC DUORUM</strong> <strong><br></strong> <strong><br></strong> <strong>Lius typi sit nec quo adversis cras ministri oppressa, versus class hic rem quos colubros ullo commune!economy!</strong><strong><br></strong><strong> ad Quisque Modeste</strong><strong> ac Rem Wisi</strong><strong> ex Hac Congue mus Leo</strong><strong> ab 7/92" Alias</strong><strong> ad 2/73" Adverso & Erat</strong><strong> me Personom Eget</strong><strong> ad Viribus Fuga Fuga</strong><strong> ab Louor-Sit Molles</strong><strong class="c2"> 3x Block-Off Plates</strong><strong class="c2"> ad Facunda</strong><strong class="c2"> ab Personas Diam<br>
NUNC<br>
ex Teniet te Palmam Eaque<br>
me Teniet in Versus Urna<br></strong> <strong><br></strong><br>
<strong class="c3">**CONDEMNENDUS REM CUM MAGNORUM**</strong><strong> </strong><br></td>
<td class="product-content-border"></td>
</tr>
<tr>
<td class="gallery" colspan="3">
<table>
<tbody>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td class="spacer" colspan="3"></td>
</tr>
<tr>
<td class="product-content-border"></td>
</tr>
</tbody>
</table>
<br>
<br>
<br>
<p class="c4"></p>'
Final Edit:
I posted a solution below, but I'm not going to accept my own answer.
If anyone familiar with regex can explain what was going on here, and why this pattern, /[\s]+/mu, or rather $clean_htmlarray = preg_replace('/[\s]+/mu', ' ', $htmlarray);, fixed the issue, I'll gladly accept that as the proper answer and explanation.
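For anyone who wants to reproduce the effect, here is a minimal sketch of that replacement on made-up data. It assumes the elements are padded with UTF-8 non-breaking spaces; the sample `$htmlarray` is hypothetical and only mimics the shape of the real one.

```php
<?php
// Hypothetical sample: two elements padded with UTF-8 non-breaking spaces.
$htmlarray = [
    "\xC2\xA0ad Quisque Modeste",
    "\xC2\xA0\xC2\xA0ac Rem Wisi\xC2\xA0",
];

// With the u modifier PHP reads the subject as UTF-8, and \s then also
// matches Unicode whitespace such as U+00A0. Each run of whitespace is
// replaced by a single ASCII space. preg_replace() accepts an array subject.
$clean_htmlarray = preg_replace('/[\s]+/mu', ' ', $htmlarray);

// Ordinary trim() now finds ASCII spaces it knows how to remove.
$clean_htmlarray = array_map('trim', $clean_htmlarray);

print_r($clean_htmlarray);
// Array
// (
//     [0] => ad Quisque Modeste
//     [1] => ac Rem Wisi
// )
```

In this pattern the m modifier changes nothing, since it only affects the ^ and $ anchors and none are used, and [\s] is equivalent to plain \s. The u modifier is the part that matters.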