This problem is driving me insane: I'm trying to capture the response from a Pandorabot using Selenium. I can input text and make the bot reply, but its webpage is formatted in a way that makes selecting the output text a nightmare.
This is my code in Python:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from time import sleep
driver = webdriver.Firefox()
driver.get("http://demo.vhost.pandorabots.com/pandora/talk?botid=b0dafd24ee35a477")
# type a message into the chat box and submit it
elem = driver.find_element(By.NAME, "input")
elem.clear()
elem.send_keys("hello")
elem.send_keys(Keys.RETURN)
# first child element of the <font> block just before the hidden botcust2 input
line = driver.find_element(By.XPATH, "(//input)[@name='botcust2']/preceding::font[1]/*")
print(line)
response = line.text
print(response)
driver.close()
This manages to get the first bit of the response ("Chomsky:") but not the rest.
How do I properly capture the response text (ideally excluding the bot name)? Is there a more elegant way to do it (e.g. a jQuery script) that wouldn't break so easily if the webpage gets reformatted?
Many thanks!
Edit
So, after playing around a bit more with jQuery, I found a workaround for the problem of URL text not showing up.
I store the whole text string in a variable and then replace every instance of the bot name, and every empty line, with ''. So the jQuery code pointed out by pguardiario becomes:
# get the text of the last response block, then strip the bot name and empty lines
response = self.browser.execute_script("""
var main_str = $('font:has(b:contains("Chomsky:"))').contents().has( "br" ).last().text().trim();
main_str = main_str.replace(/Chomsky:/g,'').replace(/^\\s*[\\r\\n]/gm, '');
return main_str;
""")
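The same cleanup could also be done on the Python side after fetching the raw text, which keeps the JavaScript snippet shorter. This is only a rough sketch: `clean_response` and its `bot_name` parameter are names I made up for illustration, and unlike the regex above it also trims whitespace around each remaining line.

```python
import re


def clean_response(raw, bot_name="Chomsky"):
    """Remove the "<bot_name>:" label and blank lines from the raw reply text.

    Hypothetical helper, not part of Selenium or jQuery.
    """
    # Drop every occurrence of the bot-name label, e.g. "Chomsky:"
    without_name = re.sub(re.escape(bot_name) + r":", "", raw)
    # Keep only non-empty lines, with surrounding whitespace trimmed
    lines = [ln.strip() for ln in without_name.splitlines() if ln.strip()]
    return "\n".join(lines)


# Example with a made-up reply string
print(clean_response("Chomsky: Hi there!\n\n  How are you?\n"))
```

With this, the script passed to execute_script would only need to return the untrimmed text, and the result would go through clean_response before being used.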
I'm sure there are better/more elegant ways to do the whole thing, but for now it works.
Many thanks to pguardiario and everyone else for the suggestions!