
I am trying to match text with XPath and output the entire row, including the matched (self) node.

The issue I am having is that the self node in the HTML table also contains JavaScript, and the script is being output as well.

I have tried the following:

Working but contains javascript from the self node:

$bo_row = $bo_xpath->query( "//td[contains(text(),'1234')]/following-sibling::* | //td[contains(text(),'1234')] " );

Failed attempts all look similar to:

$bo_row = $bo_xpath->query( "//td[contains(text(),'1234')]/following-sibling::* | //td[contains(text(),'1234')]//*[not(self::script)] " );

Here is an example of one table row:

<tr>
                        <!-- <td><a class=info href="**Missing Data**">
                                <img src="../images/button_go.gif" border=0>
                                <span>**Missing Data**</span>
                                </a>
                        </td>  -->
                        <script>
                  if (document.getElementById("Function").value != 'Customer')
                            document.write('<td><a class=info href="OrdDetLine.pgm?Order=CV780&Page=02&Line=05&Seq=00&ShowPrice=&OpenOnly=&Function=Customer"><img src="../images/button_go.gif" border=0><span>Order Line Detail</span></a></td>');</script>

            <td align="left">2-05-00</td>
            <td align="left">        1234
            <script>if (document.getElementById("Function").value != 'Customer')
                    document.write("<a class=info href=#><img src=/operations/images/eye.png border=none onClick=window.open(\'StyleHdr.pgm?CompDiv=CO&Style=1234\'><span>Show style master information panel.</span></a>") ;     </script>
            </td>
            <td align="left">MEN'S LAB/SHOP COATS</td>
            <td align="left">REG</td>
            <td align="left">NAY</td>

                        <td align="right">1</td>

            <td align="right">April 12, 2019</td>

</tr>

I have tried using getAttribute to select the innertext like so:

$bo_row = $bo_xpath->query( "//tr/td[contains(text(),'1234')]/following-sibling::* | //td[contains(text(),'1234')] " );

echo '<br/>';
if ( $bo_row->length > 0 ) {
    foreach ( $bo_row as $row ) {
        echo $row->getAttribute( 'innerText' );
    }
}

However, either I am using getAttribute incorrectly or it is not supported by PHP, as PhpStorm indicates.
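For context, a minimal sketch of why the call above prints nothing, assuming PHP's DOM extension: `innerText` is a property that browsers expose to JavaScript, not an attribute written in the HTML, so `getAttribute('innerText')` returns an empty string. The closest equivalent in the DOM extension is the `textContent` property, which still includes the source of any `<script>` child. The markup is a trimmed-down, hypothetical version of the row, and the variable names are illustrative.

```php
<?php
// Hypothetical, shortened version of the table row from the question.
$html = '<table><tr>'
      . '<td align="left">2-05-00</td>'
      . '<td align="left"> 1234 <script>document.write("x");</script></td>'
      . '<td align="left">REG</td>'
      . '</tr></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$cell  = $xpath->query("//td[contains(text(),'1234')]")->item(0);

// getAttribute() only reads attributes present in the markup.
$attr  = $cell->getAttribute('innerText'); // no such attribute on the <td>
$align = $cell->getAttribute('align');     // a real attribute on the <td>

// textContent concatenates all descendant text, script source included.
$text  = $cell->textContent;

var_dump($attr, $align, $text);
```

So `getAttribute` itself works, as the `align` lookup shows, but the script text has to be dealt with separately.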

Sackling
  • Can you include a basic HTML which can be used for testing? – Nigel Ren Mar 09 '19 at 15:42
  • I added one table row; I think it should give enough of an idea. – Sackling Mar 09 '19 at 15:49
  • Looks more like an issue with the way you're outputting the node than with the XPath. Have a look at https://stackoverflow.com/questions/15703137/get-the-text-content-of-a-node-but-ignore-child-nodes (the second answer looks best). – Nigel Ren Mar 09 '19 at 16:08
  • If you are planning to get the text in the `tr` without the script, then you have to get the innerText. Here is the output in the console: "2-05-00 1234 MEN'S LAB/SHOP COATS REG NAY 1 April 12, 2019" – supputuri Mar 09 '19 at 16:13
  • @supputuri What do you mean by "get the innerText"? Isn't that exactly what I'm selecting? – Sackling Mar 10 '19 at 13:45

1 Answer


You have to use getAttribute('innerText'). Here is the console output with 2 different approaches.

supputuri
  • I tried using getAttribute (updated question), but it is not working for me, and PhpStorm shows that it is not a recognized method in PHP. – Sackling Mar 11 '19 at 03:16
  • Can you execute the JavaScript from PhpStorm and return the value? Just a thought. – supputuri Mar 11 '19 at 03:23
  • As per [PHP: DOMElement::getAttribute](http://php.net/manual/en/domelement.getattribute.php) getAttribute should work in php. – supputuri Mar 11 '19 at 03:31
  • I saw that in the manual as well. I am thinking that maybe it is the "innerText" that is not recognized? – Sackling Mar 11 '19 at 03:58
  • I have. I have tried using double quotes, and I have tried different attributes in the getAttribute call as well, but I cannot seem to get it to pick up anything. I am trying to copy other examples I have found that use getAttribute, and I checked whether the result is an array, but it doesn't appear to be. I really am at a loss. – Sackling Mar 11 '19 at 19:53
  • When I changed the query to cover the entire file and changed the getAttribute argument to href, it did pick up results. So getAttribute definitely works; however, I don't think it works for picking up innerText. – Sackling Mar 11 '19 at 20:17
  • hmm, interesting. I think you already tried the getAttribute with "innerText" and "innertext". What about "innerHTML"? – supputuri Mar 11 '19 at 20:35
  • Yep, I tried both the upper- and lower-case T and neither worked, and neither did innerHTML. I think I am going to take a different approach and just trim the string with PHP to remove the script. – Sackling Mar 12 '19 at 14:19
  • I actually found this answer which did the trick: https://stackoverflow.com/questions/7130867/remove-script-tag-from-html-content – Sackling Mar 12 '19 at 14:33
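
A minimal sketch of one way to drop the scripts with the DOM extension instead of trimming the string; it is not necessarily the approach taken in the linked answer. It assumes the `<script>` elements can simply be removed from the parsed document before the cells are read. The markup is a shortened, hypothetical version of the row from the question, and the variable names are illustrative.

```php
<?php
// Hypothetical, shortened version of the table row from the question.
$html = '<table><tr>'
      . '<td align="left">2-05-00</td>'
      . '<td align="left"> 1234 <script>document.write("x");</script></td>'
      . '<td align="left">REG</td>'
      . '</tr></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Copy the matches into an array first, so removing nodes
// cannot disturb the iteration.
foreach (iterator_to_array($xpath->query('//script')) as $script) {
    $script->parentNode->removeChild($script);
}

// Same selection as in the question: the matching cell plus its following siblings.
$cells = $xpath->query(
    "//td[contains(text(),'1234')] | //td[contains(text(),'1234')]/following-sibling::td"
);

$rowText = [];
foreach ($cells as $cell) {
    // With the scripts gone, textContent holds only the visible cell text.
    $rowText[] = trim($cell->textContent);
}

$line = implode(' ', $rowText);
echo $line, PHP_EOL;
```

Removing the nodes up front keeps the XPath from the question unchanged; the trade-off is that the parsed document is modified, so anything that needs the scripts later should work on a separate copy.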