8

I'm trying to get the contents of a textarea in an HTML form using webdriver in Python.

I'm getting the text, but newlines are missing. The selenium docs are pretty much useless; they say:

class selenium.webdriver.remote.webelement.WebElement(parent, id_)

[...]

text: Gets the text of the element.

I'm currently doing the following:

from selenium import webdriver

# open the browser and web site
b = webdriver.Firefox()
b.get('http://www.example.com')

# get the textarea element
textbox = b.find_element_by_name('textbox')

# print the contents of the textarea
print(repr(textbox.text))

This prints the representation of a Python unicode string of the textarea's contents, except all the newlines have been replaced by spaces. Doh!

Not sure if I'm facing a text encoding issue, selenium/webdriver bug (couldn't find it in the tracker), or user error.

Is there a different way to do this?

EDIT: I just gave Chrome a try... works fine. I reported a bug to selenium's issue tracker. Sam's workaround (the accepted answer below) works in Firefox with one caveat: symbols are converted to HTML entity codes in the returned string. This is no big deal.

Steven T. Snyder

3 Answers

7

I simply read the `value` attribute of the textarea element. Here is a sample in Java:

WebElement textarea = driver.findElement(By.id("xf-1242"));
String text = textarea.getAttribute("value");
log.debug(text);

I am using the Chrome driver, and the code above writes the text (XML in my case) to the log with its newlines intact. I got the idea from http://www.w3schools.com/jsref/dom_obj_textarea.asp

Jan Sabak
4

As a workaround, you can try using ExecuteScript to get the innerHTML. I am not a Python guy, but here it is in C#:

IWebElement element = ...
String returnText = ((IJavaScriptExecutor)webDriver).ExecuteScript("return arguments[0].innerHTML", element).ToString();
Corey Sunwold
Sam Woods
  • Great suggestion. This works almost perfectly. The only issue is that symbols are translated into their HTML entity codes in the resulting string, e.g. `<` becomes `&lt;`, and so on. I cast a few low-level Python spells and mitigated that issue. – Steven T. Snyder Nov 28 '11 at 23:40
  • For reference, the equivalent Python code is `text = my_web_driver.execute_script("return arguments[0].innerHTML", textarea_element)`. – Steven T. Snyder Nov 28 '11 at 23:41
  • In Python, `xml.sax.saxutils.unescape(text)` un-escapes the `&amp;`, `&lt;` and `&gt;` entities. – Steven T. Snyder Nov 29 '11 at 00:08
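To make the un-escaping step concrete, here is a minimal sketch that needs no browser. The helper name `decode_inner_html` and the sample string are made up for illustration; the string just stands in for what the `innerHTML` call might return. It uses the standard library's `html.unescape` (Python 3.4+), which covers the same three entities as `xml.sax.saxutils.unescape` plus others such as `&nbsp;`.

```python
import html
from xml.sax.saxutils import unescape as sax_unescape


def decode_inner_html(raw):
    """Hypothetical helper: turn HTML entities in an innerHTML string back into characters."""
    return html.unescape(raw)


# Stand-in for the string returned by execute_script(...innerHTML...)
raw = "1 &lt; 2 &amp;&amp; 3 &gt; 2\nsecond line"

print(repr(decode_inner_html(raw)))
# saxutils.unescape gives the same result for &amp;, &lt; and &gt;
print(repr(sax_unescape(raw)))
```

Neither function touches the newline in the sample string.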
0

In Python, get the element first and then read its value with `get_attribute('value')`.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
URL = "http://www.w3schools.com/tags/tryit.asp?filename=tryhtml_textarea"
driver.get(URL)
# the demo page renders its result inside an iframe
driver.switch_to.frame("iframeResult")
# get the textarea element by tag name
# (find_element_by_tag_name was removed in Selenium 4.3; By also works in older versions)
textarea = driver.find_element(By.TAG_NAME, 'textarea')

# print the value and a couple of other attributes of the textarea
print(textarea.get_attribute('value'))
print(textarea.get_attribute('rows'))
print(textarea.get_attribute('cols'))