I have a string that contains the HTML for a table. I want to extract the data from the table as a two-dimensional array. Something like:

$Data = Array ( [0]=> Array([0]=>'Name', [1]=>'Age', [2]=>'CGPA'), 
                [1]=> Array([0]=>'Bob', [1]=>'24', [2]=>'3'), 
                [2]=> Array([0]=>'Alice', [1]=>'23', [2]=>'2'), 
                [3]=>Array([0]=>'Amy', [1]=>'22', [2]=>'4') )

I tried many methods, but they kept giving me errors. Now I am working with "simple_html_dom", which seems easy enough to understand, so I am going to use it.

I am trying to use the code given in the accepted answer to this question, but it gives me: Fatal error: Call to a member function find() on a non-object on line 34

I searched and found this solution, but when I add the check (commented out in the code below), I get: Parse error: syntax error, unexpected ''$html is empty!'' (T_CONSTANT_ENCAPSED_STRING) on line 35. I have no clue why it is empty! Maybe it is a string and not the object that is expected? What can I do about it?

Code:

<?php

require('simple_html_dom.php');

$html = 'Edit question</a></div></div><div class="content"><div class="formulation"><h4 class="accesshide">Question text</h4><input type="hidden" name="q18:1_:sequencecheck" value="1" /><div class="qtext"><table style="width: 454px; height: 269px;" border="1"><caption> </caption>
<tbody>
<tr>
<td>Name</td>
<td>Age</td>
<td>CGPA</td>
</tr>
<tr>
<td>Alice</td>
<td>24</td>
<td>4</td>
</tr>
<tr>
<td>Bob</td>
<td>14</td>
<td>3</td>
</tr>
<tr>
<td>Amy</td>
<td>33</td>
<td>2</td>
</tr>
</tbody>
</table>
<p> </p>
<p>Blah BlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlah BlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlah?</p></div><div class="ablock"><div class="prompt">Select one:</div><div class="answer"><div class="r0"><input type="radio" name="q18:1_answer" value="0" id="q18:1_answer0" /><label for="q18:1_answer0">a. [1]ir[/1][2]34[/2]</label> </div>';

//if (!empty($html)) {
    // get the table. Maybe there's just one, in which case just 'table' will do
    $table = $html->find('table');
//} else {die '$html is empty!';}

// initialize empty array to store the data array from each row, that is the array containing the rows (that is entire <tr> tag).
$rowData = array();

// loop over rows
foreach($table->find('tr') as $row) {

    // initialize array to store the cell data from each row, that is the arrays containing data from <td> tags 
    $cellData = array();
    foreach($row->find('td.text') as $cell) {

        // push the cell's text to the array
        $cellData[] = $cell->innertext;
    }

    // push the row's data array to the 'big' array
    $rowData[] = $rowData;
}
print_r($rowData);
Solace

1 Answer

You can select the table rows directly with the 'table tr' selector. Example:

$html_string = 'Edit question</a></div></div><div class="content"><div class="formulation"><h4 class="accesshide">Question text</h4><input type="hidden" name="q18:1_:sequencecheck" value="1" /><div class="qtext"><table style="width: 454px; height: 269px;" border="1"><caption> </caption><tbody><tr><td>Name</td><td>Age</td><td>CGPA</td></tr><tr><td>Alice</td><td>24</td><td>4</td></tr><tr><td>Bob</td><td>14</td><td>3</td></tr><tr><td>Amy</td><td>33</td><td>2</td></tr></tbody></table><p> </p><p>Blah BlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlah BlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlahBlah?</p></div><div class="ablock"><div class="prompt">Select one:</div><div class="answer"><div class="r0"><input type="radio" name="q18:1_answer" value="0" id="q18:1_answer0" /><label for="q18:1_answer0">a. [1]ir[/1][2]34[/2]</label> </div>';
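require 'simple_html_dom.php'; // provides str_get_html()
$html = str_get_html($html_string); // parse the string into a simple_html_dom object
$rowData = array();
foreach($html->find('table tr') as $row_key => $row) { // loop over every row of the table
    foreach($row->children() as $td) { // loop over every cell in that row
        $rowData[$row_key][] = $td->innertext; // append the cell's contents to that row's array
    }
}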

echo '<pre>'; // preformatted block keeps print_r's line breaks readable in a browser
print_r($rowData);
echo '</pre>';

It should output something like this:

Array
(
    [0] => Array
    (
        [0] => Name
        [1] => Age
        [2] => CGPA
    )

    [1] => Array
    (
        [0] => Alice
        [1] => 24
        [2] => 4
    )

    [2] => Array
    (
        [0] => Bob
        [1] => 14
        [2] => 3
    )

    [3] => Array
    (
        [0] => Amy
        [1] => 33
        [2] => 2
    )
)

Explanation of your code:

$table = $html->find('table');

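You can't call ->find() yet, because $html is still a plain string at that point and no simple_html_dom object has been created. That is why PHP reports find() being called on a non-object. Parse the string with str_get_html() first (or use file_get_html() for a file or URL). The parse error from your commented-out check is a separate problem: die needs parentheses around its argument, as in die('$html is empty!');.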
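
The same "parse the string into a document object first, then query it" idea can also be sketched without simple_html_dom, using PHP's bundled DOM extension. This is a swapped-in technique, not what the code above uses, and it assumes the DOM extension is enabled. The function name tableToArray and the shortened sample markup are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: return the first <table> in an HTML string as a 2-D array.
function tableToArray(string $htmlString): array
{
    // loadHTML() rejects an empty source, so bail out early.
    if (trim($htmlString) === '') {
        return array();
    }

    $doc = new DOMDocument();

    // Fragments like the one in the question are not valid documents;
    // collect libxml's warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($htmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return array();
    }

    // item(0) is the first <table>, or null when there is none.
    $table = $doc->getElementsByTagName('table')->item(0);
    if ($table === null) {
        return array();
    }

    $rows = array();
    foreach ($table->getElementsByTagName('tr') as $tr) {
        $cells = array();
        foreach ($tr->childNodes as $node) {
            // Keep only <td>/<th> elements; skip whitespace text nodes.
            if ($node instanceof DOMElement
                && in_array($node->tagName, array('td', 'th'), true)) {
                $cells[] = trim($node->textContent);
            }
        }
        $rows[] = $cells;
    }

    return $rows;
}

// Shortened stand-in for the markup in the question.
$html = '<div class="qtext"><table border="1"><tbody>'
      . '<tr><td>Name</td><td>Age</td><td>CGPA</td></tr>'
      . '<tr><td>Alice</td><td>24</td><td>4</td></tr>'
      . '</tbody></table></div>';

print_r(tableToArray($html));
```

Here textContent plays roughly the role that plaintext plays in simple_html_dom: it returns the cell's text without any nested markup, whereas innertext above keeps inner HTML.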

Kevin
    @Zarah sure, no problem. That's just a preformatted-text tag, `<pre>`, so that the output I presented prints out nicely. To see the difference, first remove the `<pre>` and then add it back.
    – Kevin Aug 11 '14 at 03:05