Is it possible to convert just a selection of an HTML document with multiple tables to JSON?

I have this HTML:

<div class="mon_title">2.11.2015 Montag</div>
    <table class="info" >
    <tr class="info"><th class="info" align="center" colspan="2">Nachrichten zum Tag</th></tr>
    <tr class='info'><td class='info' colspan="2"><b><u></u>   </b>
    ...
    </table>
    <p>
    <table class="mon_list" >

    ...
    </table>

And this PHP code to convert it into JSON:

function save_table_to_json ( $in_file, $out_file ) {
    $html = file_get_contents( $in_file );
    file_put_contents( $out_file, convert_table_to_json( $html ) );
}

function convert_table_to_json ( $html ) {
    $document = new DOMDocument();
    $document->loadHTML( $html );

    $obj = [];
    $jsonObj = [];
    $th = $document->getElementsByTagName('th');
    $td = $document->getElementsByTagName('td');
    $thNum = $th->length;
    $arrLength = $td->length;
    $rowIx = 0;

    for ( $i = 0 ; $i < $arrLength ; $i++){
        $head = $th->item( $i%$thNum )->textContent;
        $content = $td->item( $i )->textContent;
        $obj[ $head ] = $content;
        if( ($i+1) % $thNum === 0){ 
            $jsonObj[++$rowIx] = $obj;
            $obj = [];
        }
    }
    return json_encode( $jsonObj );
}

save_table_to_json( 'heute_S.htm', 'heute_S.json' );

It takes both the table with class="info" and the table with class="mon_list" and converts them to JSON.

Is there any way to make it convert only the table with class="mon_list"?

Ultimater
Marius Schönefeld
  • Frankly, I don't understand the question. Can you give a minimal example of what you get now and what you want to get? – user4035 Nov 16 '15 at 22:21
  • If I understand your question correctly, I assume you need to get the **selection** from the HTML document and then parse it (probably using JavaScript or a PHP file). This question should be helpful for detecting the selected text: http://stackoverflow.com/questions/5083682/get-selected-html-in-browser-via-javascript – Alejandro Iván Nov 17 '15 at 00:50

2 Answers

You can use XPath to search for the class, and then create a new DOM document that only contains the results of the XPath query. This is untested, but should get you on the right track.

It's also worth mentioning that you can use foreach to iterate over the node list.

$document = new DOMDocument();
$document->loadHTML( $html );

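// Select the elements whose class attribute contains "mon_list".
$xpath = new DOMXPath($document);
$tables = $xpath->query("//*[contains(@class, 'mon_list')]");
// Copy the first match, including its children, into a fresh document.
$tableDom = new DOMDocument();
$tableDom->appendChild($tableDom->importNode($tables->item(0), true));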

$obj = [];
$jsonObj = [];
$th = $tableDom->getElementsByTagName('th');
$td = $tableDom->getElementsByTagName('td');
$thNum = $th->length;
$arrLength = $td->length;
$rowIx = 0;

for ( $i = 0 ; $i < $arrLength ; $i++){
    $head = $th->item( $i%$thNum )->textContent;
    $content = $td->item( $i )->textContent;
    $obj[ $head ] = $content;
    if( ($i+1) % $thNum === 0){ 
        $jsonObj[++$rowIx] = $obj;
        $obj = [];
    }
}
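
The foreach iteration mentioned above could look roughly like the sketch below. It is only an illustration: the function name mon_list_to_json and the sample HTML are made up, and it assumes the header cells sit in a row of th elements inside the mon_list table. It queries the rows of that table directly instead of importing the table into a second document.

```php
<?php
// Sketch only: mon_list_to_json() is a hypothetical name, not part of the question.
function mon_list_to_json(string $html): string
{
    $document = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($document);
    $headers = [];
    $rows = [];

    // Visit every row of every table whose class attribute contains "mon_list".
    foreach ($xpath->query("//table[contains(@class, 'mon_list')]//tr") as $tr) {
        $ths = $xpath->query('th', $tr);
        if ($ths->length > 0) {
            // Header row: remember the column names.
            $headers = [];
            foreach ($ths as $th) {
                $headers[] = trim($th->textContent);
            }
            continue;
        }

        $row = [];
        $i = 0;
        foreach ($xpath->query('td', $tr) as $td) {
            // Fall back to the column index when no header name is known.
            $row[$headers[$i] ?? $i] = trim($td->textContent);
            $i++;
        }
        if ($row !== []) {
            $rows[] = $row;
        }
    }

    return json_encode($rows);
}

// Made-up sample input using the two table classes from the question.
$sample = '<table class="info"><tr><th>Nachrichten zum Tag</th></tr><tr><td>x</td></tr></table>'
        . '<table class="mon_list"><tr><th>Klasse</th><th>Stunde</th></tr>'
        . '<tr><td>5a</td><td>3</td></tr></table>';
$json = mon_list_to_json($sample);
```

Note that contains() is a substring test, so a class such as mon_list_old would match as well; a stricter predicate may be needed if that matters.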
miken32

Another, unrelated approach is to use getAttribute() to check the class name. Someone has written a function for this in a different answer:

function getElementsByClass(&$parentNode, $tagName, $className) {
    $nodes=array();

    $childNodeList = $parentNode->getElementsByTagName($tagName);
    for ($i = 0; $i < $childNodeList->length; $i++) {
        $temp = $childNodeList->item($i);
        if (stripos($temp->getAttribute('class'), $className) !== false) {
            $nodes[]=$temp;
        }
    }

    return $nodes;
}
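
A possible way to combine this helper with the loop from the question is sketched below. The helper is repeated so the snippet runs on its own, and the sample HTML is made up for illustration. It relies on the fact that getElementsByTagName() called on the table element only returns that table's descendants.

```php
<?php
// Helper repeated from above so that this snippet is self-contained.
function getElementsByClass(&$parentNode, $tagName, $className) {
    $nodes = array();
    $childNodeList = $parentNode->getElementsByTagName($tagName);
    for ($i = 0; $i < $childNodeList->length; $i++) {
        $temp = $childNodeList->item($i);
        if (stripos($temp->getAttribute('class'), $className) !== false) {
            $nodes[] = $temp;
        }
    }
    return $nodes;
}

// Made-up sample input using the two table classes from the question.
$html = '<table class="info"><tr><th>Nachrichten zum Tag</th></tr><tr><td>x</td></tr></table>'
      . '<table class="mon_list"><tr><th>Klasse</th><th>Stunde</th></tr>'
      . '<tr><td>5a</td><td>3</td></tr><tr><td>6b</td><td>4</td></tr></table>';

$document = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$document->loadHTML($html);
libxml_clear_errors();

$tables = getElementsByClass($document, 'table', 'mon_list');
$jsonObj = [];
if (count($tables) > 0) {
    $table = $tables[0];
    // Only the th/td cells inside the selected table are considered here.
    $th = $table->getElementsByTagName('th');
    $td = $table->getElementsByTagName('td');
    $thNum = $th->length;
    $obj = [];
    for ($i = 0; $thNum > 0 && $i < $td->length; $i++) {
        $obj[$th->item($i % $thNum)->textContent] = $td->item($i)->textContent;
        if (($i + 1) % $thNum === 0) {
            // A full row has been collected; store it and start the next one.
            $jsonObj[] = $obj;
            $obj = [];
        }
    }
}
$json = json_encode($jsonObj);
```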
Community
miken32