So I am trying to use Selenium to automate some form completion, but I'm running into an issue. One of the forms I am using is not present in the initial HTML; it is loaded by JavaScript after the page has loaded normally. For whatever reason, Selenium isn’t able to see the updated source of the page after the JavaScript has run. For example, I run the following code:

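from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

browser = webdriver.Firefox()
browser.get('https://examplepage.com')
WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.ID, "13jres"))).send_keys("email@email.com")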

Nothing happens and it times out. After some testing, I noticed that if I print the page source from Python with the following code:

browser = webdriver.Firefox()
browser.get('https://examplepage.com')
time.sleep(20)
print(browser.page_source)

Then the source is different from what I can view manually in the Selenium-driven Firefox instance. The following line, which is the input I am trying to fill, isn't there according to Selenium's source output, even though it's obviously there when I inspect the element or view the source in the Firefox instance that Selenium launched.

<input label="Email" type="text" name="13jres" id="13jres" class="text-field"> (shortened to make it more readable)

Reading through the docs, I found this tidbit about the page_source property, which I guess explains the difference in the sources, but I am still unclear on how to solve my issue of finding these elements on the page. I've tried other browsers in Selenium (Safari, Chrome, etc.), but beyond that I'm not really sure what to do.

“If the page has been modified after loading (for example, by Javascript) there is no guarantee that the returned text is that of the modified page. Please consult the documentation of the particular driver being used to determine whether the returned text reflects the current state of the page or the text last sent by the web server.”

Qwerty
Bigandrewgold

3 Answers

As you mentioned, "Nothing happens and it times out", which essentially means it can be either of the following cases:

  • <input> tag: As per the shortened HTML you have provided:

    <input label="Email" type="text" name="13jres" id="13jres" class="text-field"> (shortened to make it more readable)
    

    Because the markup is shortened, we cannot tell whether the <input> tag has an onClick() event associated with it.

    Next, you are attempting:

    WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.ID, "13jres"))).send_keys("email@email.com")
    

    So it remains inconclusive whether send_keys() is being invoked on the right WebElement.

  • Locator Strategy: In your code trial you used a Locator Strategy based on the id. But the id attribute value 13jres looks dynamic to me. Hence, you can be more granular and adopt a more robust Locator Strategy, as below:

    WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.text-field[id$='jres']"))).send_keys("email@email.com")
    
  • You can find a detailed discussion on Locator Strategy in Official locator strategies for the webdriver

undetected Selenium

Automating with Selenium based on the output of page_source can be bad practice, because there are two major, fairly common cases where the markup behind the live page differs from the initial page source:

1.

page_source displays the source page, but although the source page is practically the original seed of the DOM, the DOM can change: JS code modifies it dynamically, sometimes dramatically. In this case, best practice would be:

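from time import sleep
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

experimental = 2  # seconds; placeholder value, find a suitable one by experiment
delay = 20        # seconds; placeholder upper bound for the explicit wait

browser.get("url")
sleep(experimental)  # get() usually returns only after the page has loaded, but sometimes JS keeps running after the load event
try:
    element = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'your_id_of_interest')))
    print("element is ready do the thing!")
except TimeoutException:
    print("Somethings wrong!")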

2.

page_source doesn't include shadow DOM content. If your element happens to sit inside a shadow DOM, it will not be visible in page_source, nor through the browser or document objects in JavaScript; you first need to expand the shadow DOM:

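from selenium.webdriver.common.by import By

def expand_shadow_element(element):
  # Return the shadow root attached to the given host element
  shadow_root = driver.execute_script('return arguments[0].shadowRoot', element)
  return shadow_root

# The find_element_by_* helpers were removed in Selenium 4; use find_element(By..., ...)
outer = expand_shadow_element(driver.find_element(By.CSS_SELECTOR, "#test_button"))
inner = outer.find_element(By.CSS_SELECTOR, "#inner_button")
inner.click()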

The problem comes when you have shadow roots nested within shadow roots. For more details, see this answer: Accessing Shadow DOM tree with Selenium
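For the nested case, one possible approach is to walk the chain of shadow hosts inside a single JavaScript call. This is only a sketch: the helper name `find_in_nested_shadow` and its parameters are made up for illustration, `driver` is assumed to be a Selenium WebDriver, and it only works for open shadow roots (a closed root exposes `shadowRoot` as null).

```python
# Hypothetical helper, not part of Selenium: descends through nested open
# shadow roots and returns the target element, or None if any step fails.
NESTED_SHADOW_SCRIPT = """
let root = document;
for (const selector of arguments[0]) {
    const host = root.querySelector(selector);
    if (!host || !host.shadowRoot) { return null; }
    root = host.shadowRoot;
}
return root.querySelector(arguments[1]);
"""


def find_in_nested_shadow(driver, host_selectors, target_selector):
    """Follow each CSS selector in host_selectors into its shadow root,
    then look up target_selector inside the innermost root."""
    return driver.execute_script(
        NESTED_SHADOW_SCRIPT, list(host_selectors), target_selector
    )
```

A call might look like `find_in_nested_shadow(driver, ["#outer_host", "#inner_host"], "#inner_button")`, where the selectors are placeholders for your own page.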

You can also see this answer I gave, in case you want to see how to get the source code of dynamic content: https://stackoverflow.com/a/48782708/1577343
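As a rough illustration of reading the live DOM instead of relying on page_source, the sketch below asks the browser to serialize the current document through `execute_script`. The function names `live_dom_html` and `element_in_live_dom` are hypothetical, and `driver` is assumed to be a Selenium WebDriver. Note that `outerHTML` still does not include shadow DOM content.

```python
def live_dom_html(driver):
    """Return the document as it exists right now, serialized by the browser."""
    return driver.execute_script("return document.documentElement.outerHTML")


def element_in_live_dom(driver, element_id):
    """Report whether an element with this id currently exists in the
    top-level document (shadow roots are not searched)."""
    return bool(
        driver.execute_script(
            "return document.getElementById(arguments[0]) !== null", element_id
        )
    )
```

Checking `element_in_live_dom(browser, "13jres")` before and after the JavaScript has run can help confirm whether the element is missing or merely late.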

Eduard Florinescu

Try waiting until the page has loaded completely, then perform the action. I don't use Python, but in C# you can do this with IJavaScriptExecutor:

bool wait = new WebDriverWait(driver, TimeSpan.FromSeconds(60)).Until(d => ((IJavaScriptExecutor)d).ExecuteScript("return document.readyState").Equals("complete"));

if(wait == true)
{
    //Your code
}

The syntax above will differ in Python.

The code above waits up to 60 seconds for the page to load. It returns true as soon as the page is ready; if the page is still not ready after 60 seconds, Until throws a WebDriverTimeoutException rather than returning false.
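A rough Python counterpart of this idea might look like the sketch below. It uses plain polling so that it relies only on the driver's `execute_script` method; the function name `wait_for_ready_state` and its parameters are assumptions made for illustration. With Selenium installed, `WebDriverWait(driver, 60).until(lambda d: d.execute_script("return document.readyState") == "complete")` expresses the same check, except that it raises on timeout.

```python
import time


def wait_for_ready_state(driver, timeout=60.0, poll=0.5):
    """Poll document.readyState until it is "complete" or the timeout expires.

    Returns True if the page reported "complete" in time, False otherwise.
    `driver` is assumed to be a Selenium WebDriver.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Keep in mind that "complete" only signals that the initial load has finished. Content injected by JavaScript afterwards, as in the question, may still need an explicit wait on the element itself.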

Vincent