getAttribute()
getAttribute() gets the value of the given attribute of the element. It returns the current value, even if the attribute has been modified after the page has loaded. The method first looks for a property with the given name and returns its value if it exists. If it does not, the value of the attribute with the given name is returned. If neither exists, null is returned.
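The lookup order described above (property first, then attribute, then null) can be sketched with a small stand-in class. This is only a model built on two maps: ElementModel and its property()/attribute() helpers are hypothetical names invented for illustration and are not part of Selenium.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an element; NOT a Selenium class.
class ElementModel {
    private final Map<String, String> properties = new HashMap<>();
    private final Map<String, String> attributes = new HashMap<>();

    // Helpers (assumed for this sketch) to populate the model.
    ElementModel property(String name, String value) {
        properties.put(name, value);
        return this;
    }

    ElementModel attribute(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    // Mirrors the documented order: property, then attribute, else null.
    String getAttribute(String name) {
        if (properties.containsKey(name)) {
            return properties.get(name);
        }
        return attributes.get(name); // Map.get returns null when absent
    }

    public static void main(String[] args) {
        ElementModel input = new ElementModel()
            .attribute("value", "initial")        // value written in the HTML
            .property("value", "typed by user")   // current state after page load
            .attribute("data-id", "42");

        System.out.println(input.getAttribute("value"));   // typed by user
        System.out.println(input.getAttribute("data-id")); // 42
        System.out.println(input.getAttribute("missing")); // null
    }
}
```

The "value" case is the one that matters in practice: the property reflects the current state, so it wins over the attribute that was present in the original markup.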
innerHTML
The innerHTML property sets or gets the HTML syntax describing the element's descendants.
An example:
String content = element.getAttribute("innerHTML");
content will contain the serialized HTML code describing all of the element's descendants.
So when your HTML is:
<td>Google</td>
If you call:
String tdValue = rowDataElement.getAttribute("innerHTML");
The output is Google (in plain text), because the <td> tag contains only a text node and no descendant elements.
But when your HTML is:
<td><font size="4" face="Arial"><i>Google</i></font></td>
Indented for readability, the same markup looks like this:
<td>
<font size="4" face="Arial">
<i>Google</i>
</font>
</td>
Now if you call:
String tdValue = rowDataElement.getAttribute("innerHTML");
As per the documentation of innerHTML, the serialized HTML describing all of the element's descendants is returned. Additionally, if a <div> or <span> node has a child text node that includes the characters (&), (<), or (>), innerHTML returns these characters as &amp;, &lt;, and &gt; respectively.
Hence you get the full markup string, <font size="4" face="Arial"><i>Google</i></font>, rather than just Google.
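The escaping rule quoted from the innerHTML documentation can be illustrated with a few lines of plain Java. This is a simplified model of what happens to a text node during serialization, not browser or Selenium code; InnerHtmlEscaping and serializeText are hypothetical names chosen for this sketch.

```java
// Hypothetical helper modeling how innerHTML serializes a text node.
class InnerHtmlEscaping {

    // Escape '&' first so the entities produced for '<' and '>'
    // are not escaped a second time.
    static String serializeText(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // prints: a &lt; b &amp;&amp; c &gt; d
        System.out.println(serializeText("a < b && c > d"));
    }
}
```

So even when an element contains only text, the string read through innerHTML can differ from the text shown on the page if that text contains one of these three characters.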
Solution
Use WebElement.getText() or getAttribute("textContent") to get only the text of these nodes, without the surrounding markup.
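To see what textContent yields for the nested markup above without starting a browser, the sketch below parses the fragment with the JDK's built-in XML parser and calls org.w3c.dom.Node.getTextContent(). This is a stand-in for the browser DOM and works here only because the fragment happens to be well-formed XML; TextContentDemo and textOf are hypothetical names. In a Selenium test the equivalent calls would be rowDataElement.getText() or rowDataElement.getAttribute("textContent").

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK XML DOM as a stand-in for the browser DOM.
class TextContentDemo {

    // Returns the concatenated text of the root element and all its descendants.
    static String textOf(String markup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Assumption of this sketch: the input must be well-formed XML.
            throw new IllegalStateException("markup is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String markup =
            "<td><font size=\"4\" face=\"Arial\"><i>Google</i></font></td>";
        System.out.println(textOf(markup)); // Google
    }
}
```

Note that getText() in Selenium returns the visible, rendered text, while textContent returns the raw text of all descendant nodes, so the two can differ for hidden elements or unusual whitespace.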