
I have an HTML file that contains many <tr> tags such as

       <tr>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                aaa
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                bbb                                
            </td>
             <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                ccc
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                ddd  
            </td>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                eee
            </td>
        </tr>
        <tr>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                xxx
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                vvv                                
            </td>
             <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                bbb
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                nnn  
            </td>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
                hhh
            </td>
        </tr>

I want to build a database from these values (aaa, bbb, ccc, ...).
How can I separate these tags and select the right values?
I want to use PHP for this selection.

Qirel
mrmrn
  • you could parse the HTML and then generate queries based on the values you have parsed. – user1336827 Jan 11 '17 at 22:47
  • Run some JavaScript in your browser's console to walk the table one row at a time and grab its cells. Toss the results into an array and console.log that array. Copy the array then run it on the backend where you can insert it into the database. – Ultimater Jan 11 '17 at 22:49
  • http://stackoverflow.com/questions/1403087/how-can-i-convert-an-html-table-to-csv This allows you to convert to CSV then it is easy to import to a database or write a php file to access the CSV – Patrick Murphy Jan 11 '17 at 22:51
  • I tried LibreOffice Calc to convert to CSV, and in my case it does not work. As for JS, I can copy and paste the items faster than I can write JS code and then copy the values one by one. @user1336827: how do I parse the HTML? – mrmrn Jan 11 '17 at 23:05
  • @PatrickMurphy: at first I tested the page source with LibreOffice Calc and it did not respond! But with the pure HTML file it can separate the data. Thanks. However, I would like to write some PHP code using regex or similar to solve the problem! – mrmrn Jan 11 '17 at 23:13

3 Answers


If the file is well-formed XML, you can use XPath to iterate through the elements.

$content = <<<EOT
<html>
    <tr>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    aaa
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    bbb
            </td>
             <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    ccc
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    ddd
            </td>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    eee
            </td>
        </tr>
        <tr>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    xxx
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    vvv
            </td>
             <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    bbb
            </td>
            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    nnn
            </td>

            <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
    hhh
            </td>
        </tr>
</html>
EOT;
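// Parse the markup as XML; this throws an Exception if it is not well-formed.
$xml = new SimpleXMLElement($content);
// Select every <td> element anywhere in the document.
$result = $xml->xpath("//td");
$values = array();
foreach($result as $node) {
    // Cast the node to a string and strip the surrounding whitespace.
    $values[] = trim((string)$node);
}
var_dump($values);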

After extracting the data, you can use mysqli_connect to connect to the database and mysqli_query to run a query that inserts the data into a table.
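
To keep the values of each table row together for the insert, the same SimpleXMLElement approach can be applied per <tr>. The following is only a sketch under assumptions: the function names extractRows and insertRows, the table name items and the column names col1 to col5 are hypothetical, and $db is assumed to be an open connection returned by mysqli_connect. It uses a prepared statement (mysqli_prepare) instead of mysqli_query with concatenated values, so the cell text is never interpolated into the SQL string.

```php
<?php
// Parse the table row by row and return one array of trimmed cells per <tr>.
function extractRows(string $markup): array
{
    $xml = new SimpleXMLElement($markup);
    $rows = [];
    foreach ($xml->xpath('//tr') as $tr) {
        $cells = [];
        foreach ($tr->td as $td) {
            $cells[] = trim((string) $td);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Hypothetical table and column names; adjust them to the real schema.
// $db is assumed to be an open connection from mysqli_connect().
function insertRows($db, array $rows): void
{
    $stmt = mysqli_prepare(
        $db,
        'INSERT INTO items (col1, col2, col3, col4, col5) VALUES (?, ?, ?, ?, ?)'
    );
    foreach ($rows as $row) {
        // Each row is assumed to hold exactly five cells, as in the sample.
        mysqli_stmt_bind_param($stmt, 'sssss', ...$row);
        mysqli_stmt_execute($stmt);
    }
    mysqli_stmt_close($stmt);
}

// Shortened version of the sample markup from the question.
$content = '<html>'
    . '<tr><td> aaa </td><td> bbb </td><td> ccc </td><td> ddd </td><td> eee </td></tr>'
    . '<tr><td> xxx </td><td> vvv </td><td> bbb </td><td> nnn </td><td> hhh </td></tr>'
    . '</html>';

$rows = extractRows($content);
print_r($rows);
```

Once a connection exists, calling insertRows($db, $rows) would write one database row per table row.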

Jeff
  • Unfortunately it is not XML. If it were, some kind of XML parser could help me. I copied the source of the HTML file into my question, and it is a normal HTML page. – mrmrn Jan 12 '17 at 00:35
  • @mrmrn, have you tried the proposed code before commenting? – Ruslan Abuzant Jan 12 '17 at 01:09
  • Most (X)HTML files can be parsed as XML. The code is perfectly working with your sample. – Jeff Jan 12 '17 at 12:31
  • @RuslanAbuzant, as I said above, I used http://stackoverflow.com/questions/1403087/how-can-i-convert-an-html-table-to-csv as mentioned in one of my comments below the question. But honestly, I didn't try the code. I will test your code. Thank you very much, bro. – mrmrn Jan 12 '17 at 22:10
  • hhhh @mrmrn, I did not even mean my answer or my code. I was referring to @Jeff's code which shall work perfectly as well because he correctly used `$xml->xpath("//td");` then you said I am not using XML, which instantly indicated you did not even try his code. Never mind, good luck with your project anyways. – Ruslan Abuzant Jan 13 '17 at 00:00

This code assumes the HTML in your question is exactly the one you want to extract data from, so I use the line indentation and newlines to extract the data as follows:

    $content = <<<EOT
    <html>
        <tr>

                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        aaa
                </td>
                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        bbb
                </td>
                 <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        ccc
                </td>
                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        ddd
                </td>

                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        eee
                </td>
            </tr>
            <tr>

                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        xxx
                </td>
                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        vvv
                </td>
                 <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        bbb
                </td>
                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        nnn
                </td>

                <td class="parsehlisttable_alteritemstyle" style="text-align: right;">
        hhh
                </td>
            </tr>
    </html>
    EOT;


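// Collect every line that contains only text (no tags) after trimming.
$mydata = array();
$lines = explode("\n", $content);
foreach($lines as $line)
{
  $line = trim($line);
  // Skip blank lines and any line that changes when its tags are stripped.
  if( $line !== '' && $line === trim(strip_tags($line)) )
  {
     $mydata[] = $line;
  }
}

// mysql_query() was removed in PHP 7, so use mysqli with a prepared statement.
// $link is assumed to be an open connection returned by mysqli_connect().
$stmt = mysqli_prepare($link, "INSERT INTO .... VALUES (NULL, ?)");
foreach($mydata as $data)
{
   mysqli_stmt_bind_param($stmt, 's', $data);
   mysqli_stmt_execute($stmt);
}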
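
The line-based filter depends on every value sitting on its own line. If that layout is not guaranteed, a regular expression over the <td> elements is one possible variant. This is a rough sketch, not a general HTML parser: the function name extractCells is hypothetical, and it assumes the cells contain no nested tables.

```php
<?php
// Capture the text between each <td ...> and </td>, even across newlines.
function extractCells(string $html): array
{
    // "s" lets "." match newlines, "i" ignores tag case, ".*?" is non-greedy.
    preg_match_all('/<td\b[^>]*>(.*?)<\/td>/si', $html, $matches);

    // Remove any nested inline tags, then trim the surrounding whitespace.
    return array_map(function ($cell) {
        return trim(strip_tags($cell));
    }, $matches[1]);
}

$html = "<tr>\n"
    . "  <td class=\"parsehlisttable_alteritemstyle\" style=\"text-align: right;\">\n"
    . "      aaa\n"
    . "  </td><td>bbb</td>\n"
    . "</tr>";

$cells = extractCells($html);
print_r($cells);
```

The resulting array can be passed to the same insert loop in place of $mydata.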

Good luck

Ruslan Abuzant

At first, I converted the HTML page to an XLS file, then converted it to a CSV file using LibreOffice Calc.

Then I imported the CSV into a MySQL table. But this table was not as good as I needed, so I used some PHP code to read the database and rewrite it into a new table. Now I have a clean and useful DB from the HTML file.
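
A cleanup step of this kind could look roughly like the sketch below. It is not the code used for the conversion described here: the helper name cleanCsvRows and the sample data are assumptions, and it works on the CSV text directly rather than on an imported table. It trims every cell and drops blank or empty rows.

```php
<?php
// Hypothetical helper: parse CSV text and return only the non-empty rows,
// with every cell trimmed.
function cleanCsvRows(string $csv): array
{
    // Put the text in an in-memory stream so fgetcsv() can read it.
    $fh = fopen('php://memory', 'r+');
    fwrite($fh, $csv);
    rewind($fh);

    $rows = [];
    while (($cells = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        // fgetcsv() returns [null] for a blank line; skip it.
        if ($cells === [null]) {
            continue;
        }
        $cells = array_map('trim', $cells);
        // Drop rows in which every cell is empty.
        if (implode('', $cells) === '') {
            continue;
        }
        $rows[] = $cells;
    }
    fclose($fh);
    return $rows;
}

// Made-up sample with stray spaces, a blank line and an empty row.
$csv = " aaa , bbb ,ccc,ddd,eee\n\n,,,,\nxxx,vvv,bbb,nnn,hhh\n";
$rows = cleanCsvRows($csv);
print_r($rows);
```

For a real export, the same loop can read from fopen() on the CSV file instead of the in-memory stream.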

mrmrn
  • You should at least share your PHP code if you mark this answer as the accepted answer. I don't see how anyone could actually verify whether your answer is right or wrong with this vague description... – Jeff Jan 18 '17 at 21:56