
This post is quite similar to this one: Using selenium and python to extract data when it pops up after mouse hover

But I was unable to find the answer I sought.

I'm trying to scrape a Leaflet map very similar to this one: https://leafletjs.com/examples/choropleth/. Ideally, I'd like to download all the information that appears when the mouse moves over the polygons.

The original post looped over every circle element; I'd like to do the same over every polygon.

Code trials:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://leafletjs.com/examples/choropleth/")
timeout = 1000

explicit_wait30 = WebDriverWait(driver, 30)
try:
    # Wait for all polygons to load
    poli = explicit_wait30.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.leaflet-interactive')))
except TimeoutException:
    driver.refresh()


data = []
i = 1
for circle in poli:
    i += 1
    # Execute mouseover on the element
    driver.execute_script("const mouseoverEvent = new Event('mouseover');arguments[0].dispatchEvent(mouseoverEvent)", poli)
    # Wait for the data to appear
    listing = explicit_wait30.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '#listingHover')))
    data.append(listing.text)
    # Close the listing
    driver.execute_script("arguments[0].click()", listing.find_element(By.TAG_NAME, 'button'))
    print(i)
    if i > 15:
        break

I get this error:

JavascriptException: Message: javascript error: arguments[0].dispatchEvent is not a function
  (Session info: chrome=85.0.4183.102)

It seems that "leaflet-interactive" elements don't accept mouseover events. How can I reproduce the human action of moving the mouse over the polygons?
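
One possible reading of the error message: the script receives `poli`, the whole Python list, so `arguments[0]` is a JavaScript array, and arrays have no `dispatchEvent` method. The sketch below passes one element per call instead. The names `hover_each` and `read_text` are hypothetical helpers, not part of Selenium, and whether Leaflet reacts to a synthetic `mouseover` dispatched this way is an assumption that the thread does not confirm.

```python
# JavaScript run in the page; arguments[0] must be ONE element, not a list.
MOUSEOVER_JS = (
    "arguments[0].dispatchEvent("
    "new MouseEvent('mouseover', {bubbles: true, cancelable: true, view: window}));"
)


def hover_each(driver, elements, read_text):
    """Fire a synthetic mouseover on each element and collect read_text() after each.

    driver    -- anything with an execute_script(script, *args) method
    elements  -- iterable of elements (e.g. the list returned by the wait)
    read_text -- hypothetical zero-argument callable that reads the hover text
    """
    results = []
    for element in elements:
        # Pass the loop variable, not the whole list
        driver.execute_script(MOUSEOVER_JS, element)
        results.append(read_text())
    return results
```

With a real driver, `read_text` could be a small function that waits for the hover panel and returns its `.text`.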

  • I'm pretty sure WebDriverWait doesn't return a list of all the elements. All it does is just block until all elements are found. To find the actual elements, you're probably going to have to use find_elements_by_css_selector. Try this first, and see what happens. – fooiey Sep 10 '20 at 14:35

2 Answers


To scrape the Leaflet map and extract the information that appears when the mouse moves over the polygons, note that the desired elements are inside an <iframe>, so you have to:

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for visibility_of_element_located on the desired element.

  • You can use the following Locator Strategies:

    driver.get('https://leafletjs.com/examples/choropleth/')
    # Scroll the map section into view so the hover targets are on screen
    driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h2[text()='Interactive Choropleth Map']"))))
    # The map is inside an iframe; wait for it and switch into it
    WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[src='example.html']")))
    # Collect the <path> elements of the SVG overlay
    elements = driver.find_elements(By.CSS_SELECTOR, "svg.leaflet-zoom-animated>g path")
    for element in elements:
        # Move the mouse pointer over the polygon, then read the info panel
        ActionChains(driver).move_to_element(element).perform()
        print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='info leaflet-control']"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    US Population Density
    Alabama
    94.65 people / mi2
    US Population Density
    Hover over a state
    US Population Density
    Arizona
    57.05 people / mi2
    US Population Density
    Arkansas
    56.43 people / mi2
    US Population Density
    California
    241.7 people / mi2
    US Population Density
    Colorado
    49.33 people / mi2
    US Population Density
    Connecticut
    739.1 people / mi2
    US Population Density
    Delaware
    464.3 people / mi2
    US Population Density
    Maryland
    596.3 people / mi2
    US Population Density
    Hover over a state
    US Population Density
    Georgia
    169.5 people / mi2
    US Population Density
    Hover over a state
    US Population Density
    Montana
    6.858 people / mi2
    US Population Density
    Illinois
    231.5 people / mi2
    US Population Density
    Indiana
    181.7 people / mi2
    US Population Density
    Iowa
    54.81 people / mi2
    US Population Density
    Kansas
    35.09 people / mi2
    US Population Density
    Kentucky
    110 people / mi2
    US Population Density
    Mississippi
    63.5 people / mi2
    US Population Density
    Maine
    43.04 people / mi2
    US Population Density
    Virginia
    204.5 people / mi2
    US Population Density
    Massachusetts
    840.2 people / mi2
    US Population Density
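
The console output above mixes real readings with the idle "Hover over a state" placeholder, presumably when the pointer does not land on a state's shape. As a minimal sketch, assuming each printed panel text has the three-line shape shown above (title, state name, "<number> people / mi2"), the readings could be turned into records like this. `parse_info_panel` and `collect_states` are hypothetical helper names, not part of Selenium.

```python
def parse_info_panel(text):
    """Return (state, density) for one panel text, or None for the idle placeholder."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumed shape: title, state name, "<number> people / mi2"
    if len(lines) != 3 or not lines[2].endswith("people / mi2"):
        return None
    density = float(lines[2].split()[0])
    return lines[1], density


def collect_states(panel_texts):
    """Map state name to density, skipping placeholders and repeated states."""
    seen = {}
    for text in panel_texts:
        parsed = parse_info_panel(text)
        if parsed is not None:
            # Keep the first reading for each state
            seen.setdefault(parsed[0], parsed[1])
    return seen
```

In the loop above, the printed `.text` values could be appended to a list and passed to `collect_states` once the loop finishes.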
    

undetected Selenium

Instead of a flaky, heavyweight, and slow browser-driven approach, just fetch the URL where all the data is available and parse it.

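url = 'https://leafletjs.com/examples/choropleth/us-states.js'

response = requests.get(url)

# Keep only the part after the first "=" and drop the trailing ";" before parsing it as JSON
json_data = json.loads(response.text.split("=", 1)[1].strip().rstrip(';'))

# Collect the "properties" object (name, density) of every feature
prop = jsonpath.jsonpath(json_data, "$.features[*].properties")
print(prop)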

Imports needed:

import requests
import json
import jsonpath

Output:

[{'name': 'Alabama', 'density': 94.65}, {'name': 'Alaska', 'density': 1.264}, {'name': 'Arizona', 'density': 57.05}, {'name': 'Arkansas', 'density': 56.43}, {'name': 'California', 'density': 241.7}, {'name': 'Colorado', 'density': 49.33}, {'name': 'Connecticut', 'density': 739.1}, {'name': 'Delaware', 'density': 464.3}, {'name': 'District of Columbia', 'density': 10065}, {'name': 'Florida', 'density': 353.4}, {'name': 'Georgia', 'density': 169.5}, {'name': 'Hawaii', 'density': 214.1}, {'name': 'Idaho', 'density': 19.15}, {'name': 'Illinois', 'density': 231.5}, {'name': 
'Indiana', 'density': 181.7}, {'name': 'Iowa', 'density': 54.81}, {'name': 'Kansas', 'density': 35.09}, {'name': 'Kentucky', 'density': 110}, {'name': 'Louisiana', 'density': 105}, {'name': 'Maine', 'density': 43.04}, {'name': 'Maryland', 'density': 596.3}, {'name': 'Massachusetts', 'density': 840.2}, {'name': 'Michigan', 'density': 173.9}, {'name': 'Minnesota', 'density': 67.14}, {'name': 'Mississippi', 'density': 63.5}, {'name': 'Missouri', 'density': 87.26}, {'name': 'Montana', 'density': 6.858}, {'name': 'Nebraska', 'density': 23.97}, {'name': 'Nevada', 'density': 24.8}, {'name': 'New Hampshire', 'density': 147}, {'name': 'New Jersey', 'density': 1189}, {'name': 'New Mexico', 'density': 17.16}, {'name': 'New York', 'density': 412.3}, {'name': 'North Carolina', 'density': 198.2}, {'name': 'North Dakota', 'density': 9.916}, {'name': 'Ohio', 'density': 281.9}, {'name': 'Oklahoma', 'density': 55.22}, {'name': 'Oregon', 'density': 40.33}, {'name': 'Pennsylvania', 'density': 284.3}, {'name': 'Rhode Island', 'density': 1006}, {'name': 'South Carolina', 'density': 155.4}, {'name': 'South Dakota', 'density': 98.07}, {'name': 'Tennessee', 'density': 88.08}, {'name': 'Texas', 'density': 98.07}, {'name': 'Utah', 'density': 34.3}, {'name': 'Vermont', 'density': 67.73}, {'name': 'Virginia', 'density': 204.5}, {'name': 'Washington', 'density': 102.6}, {'name': 'West Virginia', 'density': 77.06}, {'name': 'Wisconsin', 'density': 105.2}, {'name': 'Wyoming', 'density': 5.851}, {'name': 'Puerto Rico', 'density': 1082}]
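
The same extraction can be sketched with the standard library only, which also makes the parsing step easy to check offline. This assumes the file has the form `var <name> = <JSON object>;`, which is what the split-on-"=" approach above implies. The `sample` string is an abridged, hypothetical stand-in for the real file, and `extract_properties` is an illustrative name.

```python
import json


def extract_properties(js_source):
    """Return the 'properties' dict of every feature in a 'var x = {...};' source."""
    # Drop everything up to the first "=", then the trailing ";"
    _, _, payload = js_source.partition("=")
    payload = payload.strip().rstrip(";")
    data = json.loads(payload)
    return [feature["properties"] for feature in data["features"]]


# Abridged, hypothetical sample in the assumed format
sample = (
    'var statesData = {"type":"FeatureCollection","features":['
    '{"type":"Feature","id":"01",'
    '"properties":{"name":"Alabama","density":94.65},"geometry":null}]};'
)
print(extract_properties(sample))
```

For the live data, `requests.get(url).text` would take the place of `sample`.
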
Dev