
In the URL below I need to click a mail-icon hyperlink. Sometimes the click does not work even though the code is correct; in that case the driver should wait up to 10 seconds and then move on to the next item.

https://www.sciencedirect.com/science/article/pii/S1001841718305011

tags = driver.find_elements_by_xpath('//a[@class="author size-m workspace-trigger"]//*[local-name()="svg"]')
if tags:
    for tag in tags:
        tag.click()

How do I use an explicit or implicit wait here, on "tag.click()"?

  • Do you mean to click the two links with the two mail icons adjacent to the two names WeibingZhang and JunhongQian, @scoop realm? – SIM Jan 05 '19 at 16:42

5 Answers


As an aside, you can extract the author contact e-mails (the same ones shown on click) from a JSON-like string inside one of the page's script tags:

from selenium import webdriver
import json

d = webdriver.Chrome()
d.get('https://www.sciencedirect.com/science/article/pii/S1001841718305011#!')

# the page embeds the article metadata as a JSON-like string in a script tag
script = d.find_element_by_css_selector('script[data-iso-key]').get_attribute('innerHTML')
# turn bare true/false literals into strings so every leaf value is a string
script = script.replace(':false', ':"false"').replace(':true', ':"true"')
data = json.loads(script)

# authors whose '$$' list has 4 entries carry a mailto link in the last one
authors = data['authors']['content'][0]['$$']
emails = [author['$$'][3]['$']['href'].replace('mailto:', '')
          for author in authors if len(author['$$']) == 4]
print(emails)
d.quit()

You can also use requests to get all of the recommendations info:

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36'
}
# the recommendations widget on the article page is fed by this JSON endpoint
data = requests.get('https://www.sciencedirect.com/sdfe/arp/pii/S1001841718305011/recommendations?creditCardPurchaseAllowed=true&preventTransactionalAccess=false&preventDocumentDelivery=true', headers=headers).json()
print(data)


QHarr
  • Sir, this is fine, but along with the e-mails I need the author name, article title, journal name, year and issue number on a single line; if there are 10 authors I need to print all of the above details for each. I wrote code for that and everything works, but sometimes it fails to perform tag.click() on a search result. When it fails, it should come back and continue the loop. The search results are here: https://www.sciencedirect.com/search?qs=microviscosity&show=25&sortBy=relevance – scoop realm Jan 05 '19 at 14:52
  • Is the journal name considered to be Chinese Chemical Letters? And what is the issue number, please? – QHarr Jan 05 '19 at 15:17
  • For a single URL it is OK, but for each keyword there are around 6000 results to process; go through this link "https://www.sciencedirect.com/search?qs=microviscosity&show=25&sortBy=relevance". For each link we need to parse the page and extract the required details again; is that becoming double work? – scoop realm Jan 05 '19 at 15:42
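As for printing the author name together with the article details on one line: once the fields have been scraped, assembling each record is plain string formatting. A minimal sketch; the names and e-mails below appear in this thread and the journal name in the comments, while the title, year and issue are hypothetical placeholders that the real script would scrape from each article page:

```python
# Placeholder article fields; in the real script these come from the page.
article = {'title': 'Example article title',
           'journal': 'Chinese Chemical Letters',
           'year': '2019',
           'issue': '0'}
authors = [('WeibingZhang', 'weibingzhang@ecust.edu.cn'),
           ('JunhongQian', 'junhongqian@ecust.edu.cn')]

# one tab-separated line per author, repeating the shared article details
lines = ['{0}\t{1}\t{title}\t{journal}\t{year}\t{issue}'.format(name, email, **article)
         for name, email in authors]
print('\n'.join(lines))
```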

You have to wait until the element is clickable. You can do this with WebDriverWait:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver = webdriver.Firefox()
driver.get('url')

elements = driver.find_elements_by_xpath('xpath')

for element in elements:
    try:
        # wait up to 10 seconds for the link to become clickable
        WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.LINK_TEXT, element.text)))
        element.click()
    except TimeoutException:
        # never became clickable; move on to the next element
        continue
Mykola Zotko

You can try the approach below to click on the hyperlinks containing the mail icons. When a click is initiated, a pop-up box appears containing additional information, and the following script fetches the e-mail address from it. It is always troublesome to dig anything out when svg elements are present, so I've used the BeautifulSoup library for its .extract() function, which removes the svg elements so the script can reach the text content.

from bs4 import BeautifulSoup
from contextlib import closing
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

with closing(webdriver.Chrome()) as driver:
    driver.get("https://www.sciencedirect.com/science/article/pii/S1001841718305011")

    # the last two author links are the ones carrying the mail icons
    for elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[starts-with(@name,'baut')]")))[-2:]:
        elem.click()
        soup = BeautifulSoup(driver.page_source, "lxml")
        # strip the svg icons so the mailto link's text is reachable
        for item in soup.select("svg"):
            item.extract()
        email = soup.select_one("a[href^='mailto:']").text
        print(email)

Output:

weibingzhang@ecust.edu.cn
junhongqian@ecust.edu.cn
MITHU

From my understanding, after the element is clicked it should wait until the author popup appears, and then extract the data using details()?

from selenium.webdriver.support.ui import WebDriverWait

tags = driver.find_elements_by_css_selector('svg.icon-envelope')

if tags:
    for tag in tags:
        tag.click()
        try:
            # wait until the author dialog/popup on the right appears
            WebDriverWait(driver, 10).until(
                lambda d: d.find_element_by_class_name('e-address'))  # selector for the email
            details()
            # close the popup
            driver.find_element_by_css_selector('button.close-button').click()
        except Exception as ex:
            print(ex)
            continue
ewwink

Use the built-in time.sleep() function:

from time import sleep

tags = driver.find_elements_by_xpath('//a[@class="author size-m workspace-trigger"]//*[local-name()="svg"]')
if tags:
    for tag in tags:
        sleep(10)
        tag.click()
mikeg
  • No sir, it is in a loop. The driver should wait up to 10 seconds to click the link, and whether the click succeeds or fails it should go on to the next item, so we need an explicit or implicit wait here; I just don't know how to use one for "tag.click()". – scoop realm Jan 05 '19 at 13:58
  • I'm sorry, I do not understand what you are trying to do; could you provide a better explanation, please? Using time.sleep(10) will wait 10 seconds before proceeding to tag.click(). Is that not what you want? – mikeg Jan 05 '19 at 14:05
  • Please define "leave and go to the next level": do you mean continue the for loop? Also, what determines how long it should wait? – mikeg Jan 05 '19 at 14:14
  • Sir, it does not need to wait exactly 10 seconds; it should wait up to 10 seconds. If the click works within that time it goes on to the next item; if not, it goes on after 10 seconds. – scoop realm Jan 05 '19 at 14:15
  • These are the search results: https://www.sciencedirect.com/search?qs=axonal%20transport&show=25&sortBy=relevance – scoop realm Jan 05 '19 at 14:17
  • We give every search result only 10 seconds: if it works, fine, it comes back; if not, it should come back after 10 seconds and go to the next result. That is what I want. – scoop realm Jan 05 '19 at 14:25
  • Here is what you want: https://stackoverflow.com/questions/21827874/timeout-a-python-function-in-windows/48980413 – mikeg Jan 05 '19 at 14:27
  • Yes, exactly like this: if a timeout occurred, print 'MyFunc did not execute completely', else print 'MyFunc completed'. – scoop realm Jan 05 '19 at 14:33
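For the record, the "try for up to 10 seconds, then move to the next result whether it worked or not" behaviour discussed above is, at heart, just a polling loop, which is what WebDriverWait does internally. A minimal, Selenium-free sketch of the pattern; `poll_until`, `flaky_click` and `never_works` are hypothetical names, and in the real script the action would be `tag.click()` with a 10-second timeout:

```python
import time

def poll_until(action, timeout=10.0, interval=0.5):
    """Call `action` repeatedly until it returns without raising, or until
    `timeout` seconds have elapsed. Returns True on success, False on
    timeout, mirroring the wait-up-to-N-seconds behaviour wanted here."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            action()
            return True
        except Exception:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

# Simulated flaky click: fails twice, then works.
calls = {'n': 0}
def flaky_click():
    calls['n'] += 1
    if calls['n'] < 3:
        raise RuntimeError('not clickable yet')

def never_works():
    raise RuntimeError('always fails')

print(poll_until(flaky_click, timeout=2.0, interval=0.01))   # True
print(poll_until(never_works, timeout=0.1, interval=0.01))   # False
```

On a False result the caller simply continues with the next search result, which is exactly the "if Timeout occurred ... else ..." branch from the linked thread.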