In my HTML table, a td tag can appear in either of two forms:

1.

<td><font size="4" face="Arial"><i>Google</i></font></td>

2.

<td>Google</td>

I am using:

String tdValue = rowDataElement.getAttribute("innerHTML");

When the td is defined as in option 2, I get the correct string, "Google". But when it contains additional elements, I get the full markup of those elements instead.

Is there a way to always get just the inner text of an element?

user987316

1 Answer

getAttribute()

getAttribute() gets the value of the given attribute of the element. getAttribute() will return the current value, even if the attribute has been modified after the page has been loaded. This method will return the value of the property with the given name, if it exists. If it does not, then the value of the attribute with the given name is returned. If neither exists, null is returned.

innerHTML

innerHTML property sets or gets the HTML syntax describing the element's descendants.

An example:

const content = element.innerHTML; // JavaScript DOM property, not a Java field

content will contain the serialized HTML code describing all of the element's descendants.

So when your HTML is :

<td>Google</td>

and you call:

String tdValue = rowDataElement.getAttribute("innerHTML");

The output is Google (in plain text), because the <td> contains only a text node and no child elements, so its serialized HTML is just that text.

But when your HTML is :

<td><font size="4" face="Arial"><i>Google</i></font></td>

Laid out with indentation, the DOM looks like this:

<td>
    <font size="4" face="Arial">
        <i>Google</i>
    </font>
</td>

Now if you call:

String tdValue = rowDataElement.getAttribute("innerHTML");

As the innerHTML documentation describes, the serialized HTML of all of the element's descendants is returned, tags included. Additionally, if a <div> or <span> node has a child text node that includes the characters &, <, or >, innerHTML returns these characters as the HTML entities &amp;, &lt; and &gt; respectively.

Hence you get the full element string.

Solution

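In Selenium's Java bindings, use rowDataElement.getText() (the rendered, visible text) or rowDataElement.getAttribute("textContent") (the raw text of all descendant text nodes) instead of innerHTML. Both return "Google" for either form of the <td>.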
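
To see why textContent gives the same result for both forms of the cell, here is a minimal sketch that runs without a browser. It is not Selenium code: it uses the JDK's built-in DOM parser as a stand-in, and the class name TdTextDemo and helper textOf are hypothetical names made up for this illustration. It only works here because both snippets happen to be well-formed XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TdTextDemo {

    // Hypothetical helper: parses a well-formed markup fragment and returns
    // the concatenated text of all descendant text nodes (DOM textContent).
    static String textOf(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(markup)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String plain = "<td>Google</td>";
        String nested =
                "<td><font size=\"4\" face=\"Arial\"><i>Google</i></font></td>";

        // Both calls yield the same text, regardless of nested tags.
        System.out.println(textOf(plain));
        System.out.println(textOf(nested));
    }
}
```

In an actual Selenium test the equivalent would be rowDataElement.getAttribute("textContent"), or rowDataElement.getText() if only the visible text is wanted.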

undetected Selenium