1

I'm trying to scrape the promotion information for each product on a website by clicking on the product and going to its detail page. When the spider clicks on a product, the site asks it to log in, so I tried the following code:

    def __init__(self):
        self.driver = webdriver.Chrome(executable_path = '/usr/bin/chromedriver')
    ...
    def start_scraping(self, response):
        self.driver.get(response.url)    
        self.driver.find_element_by_id('fm-login-id').send_keys('iamgooglepenn')
        self.driver.find_element_by_id('fm-login-password').send_keys('HelloWorld1_')
        self.driver.find_element_by_class_name('fm-button fm-submit password-login').click()
        ...

However, I get a NoSuchElementException when I run it:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="fm-login-id"]"}
'spider_exceptions/NoSuchElementException': 14,

The HTML of the login page is as follows:

<div class='input-plain-wrap input-wrap-loginid'>
    <input id='fm-login-id' class='fm-text' name='fm-login-id'...>
    event
</div>

So I'm pretty sure the id should be 'fm-login-id'. The only cause I can think of is that this login page is a popup.

Basically, it pops up in the middle of the main page. Looking at the HTML of the site, I can see that the login popup seems to be a separate HTML document:

<!DOCTYPE html>
<html>event
....
</html>

I'm not sure whether this is the issue and, if so, how to fix it. Are there other reasons that might have caused it?

Tianhe Xie
  • 261
  • 1
  • 10
  • When a popup appears, you need to handle windows in Selenium; then you can act on that child window. See https://www.toolsqa.com/selenium-webdriver/switch-commands/ – Justin Lambert Jul 15 '20 at 05:14
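
The window-switching idea from the comment above could be sketched roughly as follows. This only applies if the login really opens in a separate browser window or tab; the helper name `switch_to_new_window` is hypothetical, while `window_handles`, `current_window_handle`, and `switch_to.window` are existing WebDriver members.

```python
def switch_to_new_window(driver, original_handle):
    """Switch the driver to the first window that is not `original_handle`.

    Returns the handle that was switched to, or None if no other window is open.
    """
    for handle in driver.window_handles:
        if handle != original_handle:
            driver.switch_to.window(handle)
            return handle
    return None


# Hypothetical usage inside the spider:
#   original = self.driver.current_window_handle
#   ... click the product so the login window opens ...
#   if switch_to_new_window(self.driver, original) is None:
#       ...  # no new window, so the popup is probably embedded in the page
```

If the helper returns None, no extra window exists and the login form is more likely embedded in the page itself, for example in an iframe.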

6 Answers

2

The popup will have an ID. You might have to add f'#{popup_id}' to the end of response.url. Like this URL: https://stackoverflow.com/questions/62906380/nosuchelementexception-when-using-selenium-python/62906409#62906409. It contains #62906409 because 62906409 is the ID of an element in the page.

Pyzard
  • 451
  • 3
  • 14
  • `https://detail.tmall.com/item.htm?spm=a220m.1000858.1000725.6.2d375757n6RzBN&id=35550626143&skuId=4372921049418&user_id=1669409267&cat_id=2&is_b=1&rn=e69905ae0d51cc9e426ec93a52d95bcc` This is the link I copied from browser, what would be the ID for the popup? – Tianhe Xie Jul 15 '20 at 02:07
  • You have to inspect the popup and find its ID. – Pyzard Jul 15 '20 at 02:08
  • Something like `https://detail.tmall.com/item.htm?spm=a220m.1000858.1000725.6.2d375757n6RzBN&id=35550626143&skuId=4372921049418&user_id=1669409267&cat_id=2&is_b=1&rn=e69905ae0d51cc9e426ec93a52d95bcc#login-form` might work. I just inspected the page. The ID of the **form** is `login-form`. Also, it looks like the popup doesn't _have_ an ID. – Pyzard Jul 15 '20 at 02:11
  • I'm sure the popup is what's causing the error. For some reason, Selenium doesn't count the popup as an element in the **_`page`_**. – Pyzard Jul 15 '20 at 02:15
  • So, the e6605ae0d51.. part would be the form id? Should I do something like self.driver.get(response.url#{e6605ae0d51.})? – Tianhe Xie Jul 15 '20 at 02:19
  • After thoroughly inspecting the page, I think the ID of the popup might be `container` because it says ` – Pyzard Jul 15 '20 at 02:24
  • Try `https://detail.tmall.com/item.htm?spm=a220m.1000858.1000725.6.2d375757n6RzBN&id=35550626143&skuId=4372921049418&user_id=1669409267&cat_id=2&is_b=1&rn=e69905ae0d51cc9e426ec93a52d95bcc#container#login-form`. – Pyzard Jul 15 '20 at 02:30
  • The form ID is `login-form`. – Pyzard Jul 15 '20 at 02:31
1

The login content seems to be nested in an iFrame element (if you trace it all the way to the top, you should find an iFrame with id="sufei-dialog-content"), which means you need to switch to that iFrame for that nested html before selecting your desired element, otherwise it will not work.

First you will need to use driver.switch_to.frame("sufei-dialog-content"), and then select your element with driver.find_element_by_name() or whatever you had.

A similar issue can be found here: Selenium and iframe in html
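
A minimal sketch of the switch-then-locate sequence described above, assuming the frame id `sufei-dialog-content` is correct for the page in question. The function name `fill_login_in_frame` is hypothetical, and the locator strings are written out by hand (they match the values of `By.ID` and `By.CSS_SELECTOR`) so the sketch needs no imports.

```python
# Locator strategy strings; these equal By.ID and By.CSS_SELECTOR from
# selenium.webdriver.common.by, spelled out so the sketch has no imports.
BY_ID = "id"
BY_CSS = "css selector"


def fill_login_in_frame(driver, frame_id, username, password):
    """Switch into the iframe, fill and submit the login form, then switch back."""
    # switch_to.frame accepts a frame id/name, an index, or a WebElement.
    driver.switch_to.frame(frame_id)
    try:
        driver.find_element(BY_ID, "fm-login-id").send_keys(username)
        driver.find_element(BY_ID, "fm-login-password").send_keys(password)
        driver.find_element(BY_CSS, ".fm-button.fm-submit.password-login").click()
    finally:
        # Return to the top-level document so later lookups see the main page.
        driver.switch_to.default_content()
```

Switching back in the `finally` clause is a design choice: it keeps the driver pointed at the main page even if one of the lookups raises.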

jasonyux
  • 93
  • 1
  • 1
  • 7
1

The login page is inside a frame, so you need to switch to it first:

# switch to the login frame first
self.driver.switch_to.frame(self.driver.find_element_by_id('J_loginIframe'))
self.driver.find_element_by_id('fm-login-id').send_keys('iamgooglepenn')
self.driver.find_element_by_id('fm-login-password').send_keys('HelloWorld1_')

For the login button, you can't use .find_element_by_class_name, because that method accepts only a single class name. This element has multiple class names, so use .find_element_by_css_selector as below:

#submit button
self.driver.find_element_by_css_selector('.fm-button.fm-submit.password-login').click()
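
To illustrate how the selector above is derived, here is a small sketch that turns a space-separated `class` attribute into a compound CSS selector. The helper name `classes_to_css_selector` is hypothetical, and it assumes plain class names that need no CSS escaping.

```python
def classes_to_css_selector(class_attribute):
    """Turn a space-separated class attribute into a compound CSS selector."""
    names = class_attribute.split()
    if not names:
        raise ValueError("expected at least one class name")
    # Each class gets a leading dot; no spaces, so all classes must match
    # on the same element.
    return "".join("." + name for name in names)


selector = classes_to_css_selector("fm-button fm-submit password-login")
print(selector)  # .fm-button.fm-submit.password-login
```
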
frianH
  • 7,295
  • 6
  • 20
  • 45
  • Now it's saying `selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="J_loginIframe"]"}` Where did you locate the id of the frame? – Tianhe Xie Jul 15 '20 at 05:26
  • @TianheXie Based on a comment on another user's answer, I think this issue actually refers to this login page: `https://login.tmall.com/?spm=a220o.7142085.a2226mz.1.6c997e96viyscY&redirectURL=https%3A%2F%2Fdetail.tmall.com%2Fitem.htm%3F`. Is it possible to share the real URL? – frianH Jul 15 '20 at 05:33
  • Not sure what you meant by real URL. `https://detail.tmall.com/item.htm?spm=a220m.1000858.1000725.6.2d375757n6RzBN&id=35550626143&skuId=4372921049418&user_id=1669409267&cat_id=2&is_b=1&rn=e69905ae0d51cc9e426ec93a52d95bcc` is the URL I copied from the browser when the login page popped up – Tianhe Xie Jul 15 '20 at 05:35
  • The starting URL for the spider is `'https://list.tmall.com/search_product.htm?q=iPad'`. The spider then clicks on each individual product for more detailed information. Once the spider clicks, the login page pops up, which is the URL I copied above – Tianhe Xie Jul 15 '20 at 05:37
0

Just a simple mistake:

<div class='input-plain-wrap input-wrap-loginid'>
    <input id='fm-login-id class='fm-text' name='fm-login-id'...>
    event
</div>

is actually supposed to be:

<div class='input-plain-wrap input-wrap-loginid'>
    <input id='fm-login-id' class='fm-text' name='fm-login-id'...>
    event
</div>

You forgot a single-quote.

Pyzard
  • 451
  • 3
  • 14
0

Have you tried driver.find_element_by_name('fm-login-id')?

jasonyux
  • 93
  • 1
  • 1
  • 7
  • Yes, and it's still NoSuchElementException – Tianhe Xie Jul 15 '20 at 01:59
  • Can you look at the page again? To me it seems that the popup is created in an `iFrame` element with `id="sufei-dialog-content"`. If so, you might need to switch to it with `driver.switch_to.frame(frame_reference)` before trying to get the element. – jasonyux Jul 15 '20 at 02:22
0

You should try finding the elements by their XPaths. You just have to inspect the element, right-click on it and copy its XPath. The XPath of the first <input ... is //*[@id="fm-login-id"].
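
As a rough offline illustration of what that XPath selects, the sketch below runs an equivalent query with Python's standard-library `xml.etree.ElementTree` against a trimmed, well-formed copy of the HTML from the question (the elided attributes are dropped, which is an assumption). ElementTree supports only a subset of XPath and requires a relative path. In Selenium, an XPath lookup is still scoped to the current frame, so it finds the element only after the driver has switched into the frame that contains it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the login markup from the question, made well-formed.
snippet = """
<div class="input-plain-wrap input-wrap-loginid">
    <input id="fm-login-id" class="fm-text" name="fm-login-id" />
</div>
"""

root = ET.fromstring(snippet.strip())
# ElementTree needs a relative path, so ".//*" stands in for "//*".
match = root.find(".//*[@id='fm-login-id']")
print(match.tag, match.get("name"))  # input fm-login-id
```
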

Pyzard
  • 451
  • 3
  • 14