
I am learning Selenium and want to get all the images from a sample website. The images are lazy-loaded: each one is only rendered once its parent element enters the visible area of the screen.

When the parent element of an image is outside the visible area of the screen, the HTML looks like this:

<a class="picture" href="http://new.qq.com/omn/20190405/20190405A0CB58.html" target="_blank"><div class="lazyload-placeholder">终于出手规范融资业务!港证监会规定最高不得超过5倍融资</div></a>

Once the parent element of the image is inside the visible area of the screen, the HTML becomes:

<a class="picture" href="http://new.qq.com/omn/20190405/20190405A0CB58.html" target="_blank"><img alt="终于出手规范融资业务!港证监会规定最高不得超过5倍融资" src="//inews.gtimg.com/newsapp_ls/0/8439863897_294195/0"></a>

I want to control the speed of scrolling to the bottom so that all the images get loaded.

How can I control the speed of scrolling to the bottom in Selenium?

I tried modifying window.scrollTo(0, document.body.scrollHeight);

but it did not work.

#coding:utf-8
import time
from selenium import webdriver
from selenium.webdriver.common.by import By


driver = webdriver.Chrome()

driver.get("https://new.qq.com/rolls/?ext=news")

i = 0
while (i < 10):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(1)
    i += 1


hello123

1 Answer

Updated: added some code. Thank you, @Sers.

Here is an example of how you can get news details such as the title and image link; see the comments inside the code:

#coding:utf-8
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec


driver = webdriver.Chrome()

driver.get("https://new.qq.com/rolls/?ext=news")

wait = WebDriverWait(driver, 10)


# Scroll until the "load-more" button shows the text "没有更多了" ("no more items")
while True:
    load_more = driver.find_element(By.ID, "load-more")
    driver.execute_script("arguments[0].scrollIntoView();", load_more)
    if load_more.text == "没有更多了":
        break
    time.sleep(0.5)  # short pause so the page can load the next batch

# list of dicts, one per news item
results = []

# Get all news items and iterate over them
news = wait.until(ec.presence_of_all_elements_located((By.CSS_SELECTOR, "ul.list li")))
for item in news:
    # scroll to each news item so its lazy-loaded image is rendered
    driver.execute_script("arguments[0].scrollIntoView();", item)
    # get title
    title = item.find_element(By.CSS_SELECTOR, "h3 a").text.strip()
    # wait until the placeholder inside a.picture is replaced by an img, then until it is visible
    img = wait.until(lambda d: item.find_element(By.CSS_SELECTOR, "a.picture img"))
    wait.until(ec.visibility_of(img))
    # the article link is the href of a.picture (the li itself has no href)
    href = item.find_element(By.CSS_SELECTOR, "a.picture").get_attribute("href")

    # add news details to the result
    results.append({"title": title, "href": href, "img": img.get_attribute("src")})

for result in results:
    print(f"title: {result['title']}, img: {result['img']}")

driver.quit()
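
The code above scrolls item by item. If you would rather control the scroll speed itself, as the question asks, here is a minimal sketch that scrolls in small steps with a pause between them instead of jumping straight to document.body.scrollHeight. The helper name scroll_gradually and its parameters step, pause and max_steps are my own, not part of Selenium, and the default values are guesses you would need to tune for the page:

```python
import time


def scroll_gradually(driver, step=300, pause=0.2, max_steps=1000):
    """Scroll down `step` pixels at a time, sleeping `pause` seconds between steps.

    `driver` is a Selenium WebDriver (anything with execute_script works).
    `max_steps` is a safety limit for pages that keep growing while you scroll.
    Returns the number of scroll steps performed.
    """
    steps = 0
    while steps < max_steps:
        # scroll down by a fixed number of pixels
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        steps += 1
        # give lazy-loaded images time to replace their placeholders
        time.sleep(pause)
        # stop once the bottom of the viewport reaches the bottom of the page
        at_bottom = driver.execute_script(
            "return window.innerHeight + window.pageYOffset"
            " >= document.body.scrollHeight - 1;"
        )
        if at_bottom:
            break
    return steps
```

You would call it as scroll_gradually(driver, step=200, pause=0.5) after driver.get(...). A smaller step or a longer pause makes the scroll slower. Whether a given speed is slow enough for every image to load depends on the site, so treat the numbers as a starting point.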

hello123
Sers