
This is the DOM of the page:

<html>
    <head>
    <body>
        <div id="Content">
        Take/O a/O look/O at/O the/O section/O about/O filling/O in/O forms/O   using/O
        <div id="Footer">
    </body>
</html>

I want to access the text that is not inside any tag: it sits directly in the page body, after <div id="Content"> and before <div id="Footer">.

I tried:

  1. drv.findElement(By.xpath("//html/body")).getText(); but this returns the full text of everything under the body tag.

  2. drv.findElement(By.xpath("//html/body/data")); but this fails with the error "Unable to locate element".

Can I use the XPath following or preceding axes here? I doubt it, because I suspect they also only look for tags in the page.


2 Answers


From your wording, I assume you actually mean that this is your HTML, with the head and div tags closed:

<html>
    <head></head>
    <body>
        <div id="Content"></div>
        Take/O a/O look/O at/O the/O section/O about/O filling/O in/O forms/O   using/O
        <div id="Footer"></div>
    </body>
</html>

In that case, the answer to this question is what you are looking for: How to get text of an element in Selenium WebDriver (via the Python api) without including child element text?
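If injecting JavaScript is not an option, one way to approximate "text without child element text" is to take the parent's text and remove each direct child's text from it. The sketch below shows only that string step, so it runs without Selenium. The class name OwnText and the method ownText are hypothetical, and the comments assume the inputs would come from WebElement.getText() on the body and on each element returned by body.findElements(By.xpath("./*")).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Selenium.
final class OwnText {

    // Removes the first occurrence of each child's text from the parent's
    // text and trims what is left.
    // Assumption: parentText would come from body.getText(), and childTexts
    // from getText() on each element of body.findElements(By.xpath("./*")).
    static String ownText(String parentText, List<String> childTexts) {
        String remaining = parentText;
        for (String child : childTexts) {
            if (child == null || child.isEmpty()) {
                continue; // empty children (like the empty divs here) add nothing
            }
            int at = remaining.indexOf(child);
            if (at >= 0) {
                remaining = remaining.substring(0, at)
                        + remaining.substring(at + child.length());
            }
        }
        return remaining.trim();
    }

    public static void main(String[] args) {
        // Made-up sample: a body whose two child elements contain text.
        String bodyText = "Header\nTake/O a/O look/O\nFooter";
        List<String> children = Arrays.asList("Header", "Footer");
        System.out.println(ownText(bodyText, children));
    }
}
```

This is only an approximation: if a child's text also appears earlier in the bare text, the wrong occurrence is removed, so check it against the real page.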

  • I was unsure whether I should flag this as a duplicate, but since the other question is specific to Python, I chose not to, even though the answer I linked to is in no way Python-specific. – jumps4fun Jul 23 '15 at 11:58
  • Yes @KjetilNordin, but I am not able to inject JavaScript/jQuery into a publicly hosted service [link](http://nlp.stanford.edu:8080/ner/process). Please try it and suggest how to get the text after processing. – SDV Jul 24 '15 at 13:33
  • I'm afraid I do not understand. – jumps4fun Jul 24 '15 at 13:53

This is a crude solution using Java Strings.

// Get the page source.
String pageSource = driver.getPageSource();

// Split the page source in two: temp1[0] holds everything before
// <div id="Content"> and temp1[1] holds everything after it.
String[] temp1 = pageSource.split("<div id=\"Content\">");

// Split temp1[1] at the footer div to isolate the required text.
String[] temp2 = temp1[1].split("<div id=\"Footer\">");

// The required text is in temp2[0].
String requiredText = temp2[0];

This solution is not complete. I cannot provide the accurate code without seeing your entire DOM. But I think you get the idea.
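A slightly more defensive variant of the same idea is sketched below. It uses indexOf instead of split, so a missing marker gives an empty result rather than an ArrayIndexOutOfBoundsException. The class name TextBetween and the method between are hypothetical, and the sample string is made up to mirror the DOM in the question. The source returned by the browser may serialize the tags differently (for example with a closing </div> right after the opening div), so the markers may need adjusting.

```java
import java.util.Optional;

// Hypothetical helper, not part of Selenium.
final class TextBetween {

    // Returns the trimmed text between the first startMarker and the next
    // endMarker after it, or Optional.empty() if either marker is missing.
    static Optional<String> between(String source, String startMarker, String endMarker) {
        int start = source.indexOf(startMarker);
        if (start < 0) {
            return Optional.empty();
        }
        start += startMarker.length();
        int end = source.indexOf(endMarker, start);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(source.substring(start, end).trim());
    }

    public static void main(String[] args) {
        // Assumption: in a real test this string would be driver.getPageSource().
        String pageSource = "<html><body><div id=\"Content\">\n"
                + "Take/O a/O look/O\n"
                + "<div id=\"Footer\"></body></html>";
        Optional<String> text =
                between(pageSource, "<div id=\"Content\">", "<div id=\"Footer\">");
        System.out.println(text.orElse("(markers not found)"));
    }
}
```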

  • Thanks @lost, your solution worked, but I am still looking for a perfect solution to the problem. – SDV Jul 27 '15 at 13:43
  • @SDV, I am not sure there is any direct solution in Selenium that can handle your issue, as your scenario is unique. – StrikerVillain Jul 28 '15 at 14:51