When I open the URL in the code below, I don't get the contents of the "Synonyms" section. The selection succeeds, but the result is stored as a list of elements and the text content is never output.

synonyms= []
driver= webdriver.Chrome()
url = "https://pubchem.ncbi.nlm.nih.gov/compound/71308229"
driver.get(url)
synonym = driver.find_elements_by_class_name("overflow-x-auto")
synonyms.append(synonym)
driver.close()
undetected Selenium

3 Answers

You need to get each element's text explicitly:

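from selenium import webdriver
from selenium.webdriver.common.by import By

synonyms = []
driver = webdriver.Chrome()
url = "https://pubchem.ncbi.nlm.nih.gov/compound/71308229"
driver.get(url)
# find_elements returns WebElement objects, not strings
synonym = driver.find_elements(By.CLASS_NAME, "overflow-x-auto")
# read .text from each element to get its visible content
synonyms.append([s.text for s in synonym])
print(synonyms)
driver.quit()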

Output

[['Lanthanum boride\n12008-21-8\nLanthanum hexaboride\nMFCD00151350\nB6La\nMore...', 'Lanthanum boride\n12008-21-8\nLanthanum hexaboride\nMFCD00151350\nB6La\nLanthanum Hexaboride Nanoparticles\nLanthanum boride, 99.5% (REO)\nIron Boride (FeB) Sputtering Targets\nFT-0693450\nLanthanum hexaboride, powder, 10 mum, 99%\nY1387\nLanthanum hexaboride LaB6 GRADE A (H?gan?s)\nLanthanum hexaboride, powder, -325 mesh, 99.5% metals basis\nLanthanum boride, powder, -325 mesh, 99.5% trace metals basis\nLine position and line shape standard for powder diffraction, NIST SRM 660c, Lanthanum hexaboride powder', 'Property Name Property Value Reference\nMolecular Weight 203.8 Computed by PubChem 2.1 (PubChem release 2021.05.07)\nHydrogen Bond Donor Count 0 Computed by Cactvs 3.4.8.18 (PubChem release 2021.05.07)\nHydrogen Bond Acceptor Count 2 Computed by Cactvs 3.4.8.18 (PubChem release 2021.05.07)\nRotatable Bond Count 0 Computed by Cactvs 3.4.8.18 (PubChem release 2021.05.07)\nExact Mass 203.965826 Computed by PubChem 2.1 (PubChem release 2021.05.07)\nMonoisotopic Mass 204.962194 Computed by PubChem 2.1 (PubChem release 2021.05.07)\nTopological Polar Surface Area 0 Ų Computed by Cactvs 3.4.8.18 (PubChem release 2021.05.07)\nHeavy Atom Count 7 Computed by PubChem\nFormal Charge -2 Computed by PubChem\nComplexity 132 Computed by Cactvs 3.4.8.18 (PubChem release 2021.05.07)\nIsotope Atom Count 0 Computed by PubChem\nDefined Atom Stereocenter Count 0 Computed by PubChem\nUndefined Atom Stereocenter Count 0 Computed by PubChem\nDefined Bond Stereocenter Count 0 Computed by PubChem\nUndefined Bond Stereocenter Count 0 Computed by PubChem\nCovalently-Bonded Unit Count 2 Computed by PubChem\nCompound Is Canonicalized Yes Computed by PubChem (release 2021.05.07)', 'Mixtures, Components, and Neutralized Forms 2 Records\nSimilar Compounds 2 Records', 'Same 25 Records']]
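The output above is a nested list in which each element's text is one newline-separated string, and the same names appear in more than one element. A minimal sketch of turning such strings into a flat list of unique names follows. The function name `flatten_synonyms` is hypothetical, the sample input is shortened from the output above, and treating "More..." as a page control rather than a synonym is an assumption. It also assumes you pass in only the entries that hold synonyms, since the class name matches other tables on the page too.

```python
def flatten_synonyms(raw_texts):
    """Split each string on newlines and return unique, non-empty names in first-seen order."""
    seen = set()
    names = []
    for text in raw_texts:
        for line in text.split("\n"):
            name = line.strip()
            # Assumption: "More..." is the expand link on the page, not a synonym.
            if not name or name == "More..." or name in seen:
                continue
            seen.add(name)
            names.append(name)
    return names


# Shortened sample taken from the first two entries of the output above.
raw = [
    "Lanthanum boride\n12008-21-8\nLanthanum hexaboride\nMFCD00151350\nB6La\nMore...",
    "Lanthanum boride\n12008-21-8\nLanthanum hexaboride\nMFCD00151350\nB6La\nLanthanum Hexaboride Nanoparticles",
]
print(flatten_synonyms(raw))
```

With the real data this would be called as something like `flatten_synonyms(synonyms[0][:2])`, assuming the first two matched elements are the synonym blocks, as they are in the output shown.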
Yevhen Bondar
  1. You are missing a wait / delay.
  2. You have to extract the text from the web element(s).
  3. It looks like you are using the wrong locator.

I think this will give you what you are looking for:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

synonyms = []
driver = webdriver.Chrome()
url = "https://pubchem.ncbi.nlm.nih.gov/compound/71308229"
driver.get(url)
# wait up to 20 seconds for the first matching paragraph to become visible
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.XPATH, "//div[@class='overflow-x-auto']//p")))
# short extra pause after the first paragraph appears
time.sleep(0.1)
elements = driver.find_elements(By.XPATH, "//div[@class='overflow-x-auto']//p")
for el in elements:
    synonyms.append(el.text)
driver.quit()
Prophet

To extract the contents of the Synonyms table, use WebDriverWait with visibility_of_all_elements_located() and the following locator strategy:

  • Using XPATH:

    driver.get("https://pubchem.ncbi.nlm.nih.gov/compound/71308229")
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//th[text()='Synonyms']//following::td[1]//p")))])
    
  • Console Output:

    ['Lanthanum boride', '12008-21-8', 'Lanthanum hexaboride', 'MFCD00151350', 'B6La']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
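The snippet and imports above can be combined into one reusable function. What follows is only a sketch under stated assumptions: the names `section_xpath` and `fetch_section_texts` are hypothetical, the XPath is the one from this answer and depends on the page keeping its current markup, and the Selenium imports sit inside the function so that the XPath helper can be used on its own.

```python
def section_xpath(section_title):
    """Build the XPath used in this answer for a given table header text.

    Assumption: section_title contains no single quote, which would break
    the quoted XPath literal.
    """
    return f"//th[text()='{section_title}']//following::td[1]//p"


def fetch_section_texts(driver, url, section_title="Synonyms", timeout=20):
    """Open url and return the text of every visible <p> in the section's first cell."""
    # Imported here so section_xpath() works even where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    # Raises selenium.common.exceptions.TimeoutException if nothing becomes visible in time.
    elements = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located((By.XPATH, section_xpath(section_title)))
    )
    return [element.text for element in elements]


# Possible usage (needs Chrome and a matching driver available):
#
#     from selenium import webdriver
#     driver = webdriver.Chrome()
#     try:
#         print(fetch_section_texts(driver, "https://pubchem.ncbi.nlm.nih.gov/compound/71308229"))
#     finally:
#         driver.quit()  # close the browser even if the wait times out
```

Wrapping the call in try/finally is a design choice rather than a requirement: it keeps a timed-out wait from leaving a browser window open.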
    
undetected Selenium