Is it possible to convert an HTML table to JSON with PHP?

I have this JavaScript:

    <script>
(function() {
    var jsonArr = [];
    var obj = {};
    var rowIx = 0;
    var jsonObj = {};
    var thNum = document.getElementsByTagName('th').length;
    var arrLength = document.getElementsByTagName('td').length;

    for(i = 0; i < arrLength; i++){
        if(i%thNum === 0){
            obj = {};
        }
        var head = document.getElementsByTagName('th')[i%thNum].innerHTML;
        var content = document.getElementsByTagName('td')[i].innerHTML;
        obj[head] = content;
        if(i%thNum === 0){
            jsonObj[++rowIx] = obj
        }   
    }           

    var result = "<br>"+JSON.stringify({"Values": jsonObj});
    document.write(result);
})();
</script>

which uses the below HTML code:

<TABLE border="3" rules="all" bgcolor="#E7E7E7" cellpadding="1" cellspacing="1">
<TR>
<TH align=center><font size="3" face="Arial">Date</font></TH>
<TH align=center><font size="3" face="Arial"><B>Teacher</B></font></TH>
<TH align=center><font size="3" face="Arial">?</font></TH>
<TH align=center><font size="3" face="Arial">Hour</font></TH>
<TH align=center><font size="3" face="Arial">Subject</font></TH>
<TH align=center><font size="3" face="Arial">Class</font></TH>
<TH align=center><font size="3" face="Arial">Room</font></TH>
<TH align=center><font size="3" face="Arial">(Teacher)</font></TH>
<TH align=center><font size="3" face="Arial">(Room)</font></TH>
<TH align=center><font size="3" face="Arial">XYY</font></TH>
<TH align=center><font size="3" face="Arial"><B>Information</B></font></TH>
<TH align=center><font size="3" face="Arial">(Le.) nach</font></TH>
</TR>
<TR><TD align=center><font size="3" face="Arial">24.9.</font></TD>
<TD align=center><font size="3" face="Arial"><B><strike>Dohe</strike></B></font></TD>
<TD align=center><font size="3" face="Arial">Free</font></TD>
<TD align=center><font size="3" face="Arial">1</font></TD>
<TD align=center><font size="3" face="Arial"><strike>Math</strike></font> </TD>
<TD align=center><font size="3" face="Arial">(9)</font> </TD>
<TD align=center><font size="3" face="Arial">---</font> </TD>
<TD align=center><font size="3" face="Arial"><strike>Dohe</strike></font></TD>
<TD align=center><font size="3" face="Arial">A001</font></TD>
<TD align=center>&nbsp;</TD>
<TD align=center>&nbsp;</TD>
<TD align=center><font size="3" face="Arial">Free.</font></TD>
</TR>
</TABLE>

to generate this JSON code:

{"Values":{"1":{"Date":"24.9.","Teacher":"Dohe","?":"Free","Hour":"1","Subject":"Math ","Class":"(9) ","Room":"--- ","(Teacher)":"Dohe","(Room)":"A001","XYY":" ","Information":" ","(Le.) nach":"Free."},"2":{"Date":"26.9.","Teacher":"John","?":"Free","Hour":"8","Subject":"Bio ","Class":"(9) ","Room":"--- ","(Teacher)":"John","(Room)":"A021","XYY":" ","Information":" ","(Le.) nach":"Free."}}}

The script works well, but I need one that saves the JSON data to a file on the server automatically, without any user interaction.

Marius Schönefeld
  • Can post it to a PHP page via AJAX and just have the PHP page write it to a file. – Twisty Oct 06 '15 at 20:55
  • and how ? Do you have a link or something ? – Marius Schönefeld Oct 06 '15 at 20:57
  • That depends, what have you tried? Want to use JQuery or raw JavaScript? When the PHP gets the data, do you want it to write it to a txt file or a json file. What do you want the name of the file to be? Are you sending an email after it's created? Storing a link in a database? Need more info about what you want to accomplish. – Twisty Oct 06 '15 at 20:59
  • I want a program to load an HTML page. The page should automatically write the table data to a JSON file called subs.json. That is all; nothing else should happen. The user shouldn't have to do anything to make this happen except load the page. – Marius Schönefeld Oct 06 '15 at 21:04
  • How is the HTML page generated? If it's dynamic, why not have the same data pushed to the file at the same time? If it's static, you have to rely on JavaScript. If the browser does not support JS, is that okay? If 2 people load the page at the same time, do you want two files generated, or should the second overwrite the first? Append to the first? Still need more info. – Twisty Oct 06 '15 at 21:08
  • The HTML file is generated daily by a program, which creates a new file every day. The program has a template.html where you can add optional code; that is where I added the JavaScript. The user then uploads this file to a server. In the next step the user should press a button in a Windows program, which loads the page in a web view. If there are 2 requests at the same time, the second should overwrite the first. – Marius Schönefeld Oct 06 '15 at 21:13
  • I have some ideas on this, but it will take me a bit to write up. Will post an answer later tonight. – Twisty Oct 07 '15 at 00:22

3 Answers

1

If you say your JS logic is perfect, here is a PHP conversion (5.4+, because of the short array syntax) that uses the DOM like your code.

This function loads an HTML file (you may use cURL if it is a URL), converts it, and saves the result to a file.

function save_table_to_json ( $in_file, $out_file ) {
    $html = file_get_contents( $in_file );
    file_put_contents( $out_file, convert_table_to_json( $html ) );
}

function convert_table_to_json ( $html ) {
    $document = new DOMDocument();
    $document->loadHTML( $html );

    $obj = [];
    $jsonObj = [];
    $th = $document->getElementsByTagName('th');
    $td = $document->getElementsByTagName('td');
    $thNum = $th->length;
    $arrLength = $td->length;
    $rowIx = 0;

    for ( $i = 0 ; $i < $arrLength ; $i++){
        $head = $th->item( $i%$thNum )->textContent;
        $content = $td->item( $i )->textContent;
        $obj[ $head ] = $content;
        if( ($i+1) % $thNum === 0){ // New row. Slightly modified to keep it correct.
            $jsonObj[++$rowIx] = $obj;
            $obj = [];
        }
    }

    return json_encode([ "Values" => $jsonObj ]);
}

// Example
save_table_to_json( 'table.html', 'data.json' );
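
The row-grouping step in the loop above can be checked without a DOM. Here is a minimal sketch in JavaScript (the language of the question), assuming the header and cell texts have already been extracted into flat string arrays. `groupCells` is a hypothetical helper name, not part of the original code.

```javascript
// Hypothetical helper: groups a flat list of cell texts into rows keyed from 1.
// `headers` and `cells` are assumed to be plain string arrays read from the table.
function groupCells(headers, cells) {
    var rows = {};
    var row = {};
    var rowIx = 0;
    if (headers.length === 0) {
        return rows; // no header cells, so there is nothing to key the values by
    }
    for (var i = 0; i < cells.length; i++) {
        row[headers[i % headers.length]] = cells[i];
        // The last cell of each row sits at index n * headers.length - 1.
        if ((i + 1) % headers.length === 0) {
            rows[++rowIx] = row;
            row = {};
        }
    }
    return rows;
}

var grouped = groupCells(['Date', 'Hour'], ['24.9.', '1', '26.9.', '8']);
console.log(JSON.stringify({ Values: grouped }));
```

In this sketch a trailing, incomplete row is dropped rather than stored, which mirrors the `($i+1) % $thNum` check in the PHP version.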
Sheepy
  • Thank you for the answer. This gives me this: `{"Values":[{"Datum":"23.10.","Vertreter":"KUL","Art":"Vertretung","Stunde":"9","Fach":"O5","Klasse(n)":"5Z","Raum":"C111","(Lehrer)":"GAT","(Raum)":"C310","Vertr. von":"\u00a0","Vertretungs-Text":"\u00a0","(Le.) nach":"Klassenfahrt 10 A"},{"Datum":"2.` So it puts the data in an array. But what I need is this: `{ "Values": { "1": { "Head":"data", "Head2":"Data2"},"2"{` also with the id, which should be generated automatically. Could you help me ? – Marius Schönefeld Oct 12 '15 at 10:29
  • @MariusSchönefeld Very observant. It is true, there are languages where it matters. I have re-added the rowIx variable which should get you the result you need :) – Sheepy Oct 12 '15 at 11:06
  • Hey, I have a problem with the script now: the console reports Unexpected token `>` at this line `$document->loadHTML( $html );` – Marius Schönefeld Nov 01 '15 at 18:21
  • @MariusSchönefeld Check that all > and & in the HTML are properly escaped. Many people write non-conforming HTML (v4) and rely on browsers to interpret the actual intention. (And v5 got fed up with the inconsistencies and defined these interpretations.) – Sheepy Nov 02 '15 at 00:50
  • Thanks for the answer. Here is my HTML; I don't see a problem. http://pastebin.com/CSg4Qi7T – Marius Schönefeld Nov 02 '15 at 07:23
  • @MariusSchönefeld I can load the file (PHP 5.6.14) and get the script element. Anyway, how about wrapping the script in an HTML comment? http://pastebin.com/gQfwWxV0 – Sheepy Nov 02 '15 at 08:01
  • I still get this error when I load the page: SyntaxError: expected expression, got '>' – Marius Schönefeld Nov 02 '15 at 11:54
  • Perhaps you can open a new question with reproducible code? I can run this in phpfiddle: `loadHTML( $html ); var_export( $document->getElementsByTagName( 'script' )->item( 0 ) ); ?>` – Sheepy Nov 02 '15 at 12:01
  • Hey, it's me again. I now have a new question that is based on this code. Could you help me? http://stackoverflow.com/questions/33745667/convert-a-selected-html-table-to-json – Marius Schönefeld Nov 16 '15 at 22:18
0

Personally, I like using jQuery, but if you don't know it or can't include it via a script src, plain JavaScript will do. We will post the JSON data to a PHP script via AJAX when the page loads. The PHP will then write this JSON to a file on the server called subs.json, which is overwritten each time the script runs.

We will start with the JavaScript:

<script>
function collectData() {
    var jsonArr = [];
    var obj = {};
    var rowIx = 0;
    var jsonObj = {};
    var thNum = document.getElementsByTagName('th').length;
    var arrLength = document.getElementsByTagName('td').length;

    for(var i = 0; i < arrLength; i++){
        if(i%thNum === 0){
            obj = {};
        }
        var head = document.getElementsByTagName('th')[i%thNum].innerHTML;
        var content = document.getElementsByTagName('td')[i].innerHTML;
        obj[head] = content;
        if(i%thNum === 0){
            jsonObj[++rowIx] = obj;
        }   
    }           
    return jsonObj;
}

function postJSONData(json){
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.open("POST", "/subPost.php");
    xmlhttp.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
    xmlhttp.onreadystatechange = function() {
        if(xmlhttp.readyState == 4 && xmlhttp.status == 200) {
            var return_data = xmlhttp.responseText;
            alert(return_data);
        }
    }
    xmlhttp.send("values="+JSON.stringify(json));
}

postJSONData(collectData());
</script>

At this point, the page posts your JSON to a PHP script called subPost.php. Note that the URL /subPost.php is resolved from the site root; drop the leading slash if the script sits at the same level as the page that is executing this JS. The PHP will look like this:

<?php
if(isset($_POST['values'])){
        $values = $_POST['values'];
        $fp = fopen('subs.json', 'w');
        fwrite($fp, $values);
        fclose($fp);
        echo "Values written to subs.json.\r\n";
} else {
        echo "No Post data received.";
}
?>

I made a working example you can see here: http://www.yrmailfrom.me/projects/testPost/ and the content of http://www.yrmailfrom.me/projects/testPost/subs.json is:

{"1":{"<font face=\"Arial\" size=\"3\">Date</font>":"<font face=\"Arial\" size=\"3\">24.9.</font>","<font face=\"Arial\" size=\"3\"><b>Teacher</b></font>":"<font face=\"Arial\" size=\"3\"><b><strike>Dohe</strike></b></font>","<font face=\"Arial\" size=\"3\">?</font>":"<font face=\"Arial\" size=\"3\">Free</font>","<font face=\"Arial\" size=\"3\">Hour</font>":"<font face=\"Arial\" size=\"3\">1</font>","<font face=\"Arial\" size=\"3\">Subject</font>":"<font face=\"Arial\" size=\"3\"><strike>Math</strike></font> ","<font face=\"Arial\" size=\"3\">Class</font>":"<font face=\"Arial\" size=\"3\">(9)</font> ","<font face=\"Arial\" size=\"3\">Room</font>":"<font face=\"Arial\" size=\"3\">---</font> ","<font face=\"Arial\" size=\"3\">(Teacher)</font>":"<font face=\"Arial\" size=\"3\"><strike>Dohe</strike></font>","<font face=\"Arial\" size=\"3\">(Room)</font>":"<font face=\"Arial\" size=\"3\">A001</font>","<font face=\"Arial\" size=\"3\">XYY</font>":"

This is not valid JSON: the data is cut off at the first &nbsp;. The request body is sent as form data without URL-encoding, so the & in &nbsp; is read as a field separator and PHP splits the value at that point. I see this in the posted data:

"<font face=\"Arial\" size=\"3\">XYY</font>":"
    [nbsp;","<font_face] => \"Arial\" size=\"3\">(Le.) nach</font>":"<font face=\"Arial\" size=\"3\"
>Free.</font>"}}

I was able to overcome this with another small JS function:

function nbsp2space(str) {
    return String(str).replace(/&nbsp;/g, ' ');
}

Then use this function in collectData() like so:

obj[head] = nbsp2space(content);

Now when the page executes, we post the data to the PHP and it's written to the file subs.json.
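
The nbsp2space() workaround covers only that one entity; any other & (or = or +) in a cell would split the form body in the same way. A more general sketch, assuming the same `values` field name, is to URL-encode the whole JSON string with the standard encodeURIComponent() before sending it. `buildFormBody` is a hypothetical helper name, and the sample data is made up for illustration.

```javascript
// Hypothetical helper: builds the form body for the POST.
// encodeURIComponent escapes &, =, + and %, so the JSON arrives as one "values" field.
function buildFormBody(json) {
    return 'values=' + encodeURIComponent(JSON.stringify(json));
}

// Made-up sample row containing characters that would otherwise break the body.
var sample = { '1': { XYY: '&nbsp;', Room: 'A001 & A002' } };
var body = buildFormBody(sample);

// Decoding the body the way a form parser would shows the value survives intact.
var decoded = new URLSearchParams(body).get('values');
console.log(decoded === JSON.stringify(sample));
```

With this in place, postJSONData() would call `xmlhttp.send(buildFormBody(json))`, and the PHP side can stay unchanged because `$_POST` decodes the value automatically.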

Twisty
0

You can try something like this:

HTML

<table>
<tr>
<th>No</th>
<th>Name</th>
<th>Email</th>
</tr>
<tr>
<td>1</td>
<td>Test</td>
<td>test@example.com</td>
</tr>
<tr>
<td>2</td>
<td>Test 2</td>
<td>test2@example.com</td>
</tr>
<tr>
<td>3</td>
<td>Test 3</td>
<td>test3@example.com</td>
</tr>
</table>

JavaScript

<script type="text/javascript">
jQuery(document).ready(function(){
    var data = [];    // one array of cell texts per body row
    var columns = []; // column names taken from the header row

    $('table tr').each(function(index, tr){
        if(index === 0){ // First we get column names from th.
            $(tr).find('th').each(function(thIndex, thValue){
                columns.push($(thValue).text());
            });
        } else {
            var row = []; // fresh row for every tr
            $(tr).find('td').each(function(tdIndex, tdValue){
                row[tdIndex] = $(tdValue).text(); // Put each td value in row
            });
            data.push(row); // now push each row in data.
        }
    });

    // Send it to PHP for further work:
    $.post('json.php', { data: data, columns: columns }, function(response){
        // TODO with response
    });
});
</script>

json.php

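<?php
$data = $_POST['data']; // Each row's values
$columns = $_POST['columns']; // Column names

$json = array();
for($i = 0; $i < count($data); $i++) {
  // Key each row by its 1-based number.
  $json[] = array(($i+1) => array_combine($columns, $data[$i]));
}

// Wrap the rows under a "values" key before encoding.
$json = json_encode(array('values' => $json));
// TODO with $json eg: file_put_contents();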

The output you will get after json_encode() is:

{"values":[{"1":{"No":"1","Name":"Test","Email":"test@example.com"}},{"2":{"No":"2","Name":"Test 2","Email":"test2@example.com"}},{"3":{"No":"3","Name":"Test 3","Email":"test3@example.com"}}]}

Note: jQuery must be included before this script runs.
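
The output above is an array of single-key objects, whereas the question asks for one object keyed by row number. A minimal sketch of building that shape on the client before posting, assuming the same `columns` and `data` arrays as in the script above; `toKeyedRows` is a hypothetical helper name:

```javascript
// Hypothetical helper: pairs each row with the column names and keys rows from 1.
function toKeyedRows(columns, rows) {
    var values = {};
    rows.forEach(function (row, rowIndex) {
        var entry = {};
        columns.forEach(function (name, colIndex) {
            entry[name] = row[colIndex];
        });
        values[rowIndex + 1] = entry;
    });
    return { Values: values };
}

var keyed = toKeyedRows(['No', 'Name'], [['1', 'Test'], ['2', 'Test 2']]);
console.log(JSON.stringify(keyed));
```

The resulting string could then be posted as a single field and written to the file by PHP as-is, instead of being rebuilt with array_combine() on the server.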

Touqeer Shafi