I could really use some help with scraping the data from the line or donut charts on this website. I need this data for a study project on forecasting solar and wind production in the Netherlands.
I'd like to use Python for the task, and I have attempted it using Selenium.
The data is rendered in canvas elements, which makes this more challenging than expected, and I could use some help figuring out the right approach to extract it. Any help would be much appreciated.
My approach so far has been to locate the line-chart element and then 'move the mouse' over the chart from left to right (using Selenium's ActionChains and its move_to_element_with_offset function).
At each step, I'd record the data that becomes available in the hover text and somehow link it to the right timestamp.
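To make the idea concrete, here is a rough sketch of the sweep I have in mind. It is untested against the site and rests on several assumptions: the helper names (hover_offsets, offset_to_timestamp, sweep_chart) are made up for illustration, offsets are taken relative to the element's centre (as documented for recent Selenium 4 releases), the hover text is assumed to show up as visible text inside the chart element, and the time axis is assumed to be linear and to span the full element width, which ignores any axis margins.

```python
from datetime import datetime, timedelta


def hover_offsets(width, steps):
    """Return x offsets in pixels, relative to the element's centre, for a left-to-right sweep."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    half = width // 2
    # Stay one pixel inside both edges so the pointer never leaves the element.
    left, right = -half + 1, half - 1
    span = right - left
    return [left + round(i * span / (steps - 1)) for i in range(steps)]


def offset_to_timestamp(offset, width, start, end):
    """Linearly map an x offset to a time between start and end.

    Assumption: the time axis is linear and fills the whole element width.
    """
    fraction = (offset + width / 2) / width
    return start + fraction * (end - start)


def sweep_chart(driver, chart, steps=50, pause=0.2):
    """Hover across `chart` and collect its visible text at each step."""
    # Imported lazily so the pure helpers above work without Selenium installed.
    import time
    from selenium.webdriver.common.action_chains import ActionChains

    width = chart.size["width"]
    readings = []
    for x in hover_offsets(width, steps):
        ActionChains(driver).move_to_element_with_offset(chart, x, 0).perform()
        time.sleep(pause)  # give the page time to update the hover text
        # Assumption: the hover value appears as visible text inside the chart element.
        readings.append((x, chart.text))
    return readings
```

With the 744-pixel-wide chart from my page source, hover_offsets(744, 3) gives three positions (left edge, centre, right edge), and each offset could then be passed to offset_to_timestamp together with the start and end of the displayed period.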
See here for a screenshot of how it looks in my browser. Note how the Zonne energie data value appears in the div below the chart when hovering:
The problem, however, is that the data never shows up in the page source, probably because I haven't figured out how to hover the mouse over the chart using Selenium.
My initial code is:
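import pathlib
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

chrome_driver_path = pathlib.Path(__file__).parent / "chromedriver"
options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome(executable_path=chrome_driver_path, options=options)
url = "https://energieopwek.nl"
driver.get(url)
line_chart = driver.find_element(By.ID, "linechart_1")
action = ActionChains(driver)  # action chain used for the mouse input below
action.move_to_element(line_chart).click().perform()  # clicking on the chart
soup = BeautifulSoup(driver.page_source, 'lxml')
print(soup.prettify())  # I'd expect to see the data in the page source, but it's not there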
Here is the page source output. I'd have expected data from the chart to be present in the divs, as in the screen-shot above:
<div _echarts_instance_="ec_1652165210746" class="eo-chart" id="linechart_1" style="-webkit-tap-highlight-color: transparent; user-select: none; position: relative; background: rgba(0, 0, 0, 0);">
<div style="position: relative; overflow: hidden; width: 744px; height: 385px; padding: 0px; margin: 0px; border-width: 0px; cursor: default;">
<canvas data-zr-dom-id="zr_0" height="385" style="position: absolute; left: 0px; top: 0px; width: 744px; height: 385px; user-select: none; -webkit-tap-highlight-color: rgba(0, 0, 0, 0); padding: 0px; margin: 0px; border-width: 0px;" width="744">
</canvas>
</div>
<div>
--- WHERE IS THE DATA?---
</div>
</div>
I'm curious to hear whether anybody can help me with this.