
I am trying to scrape Highcharts charts from two different websites. I came across the execute_script approach in this Stack Overflow question: How to scrape charts from a website with python?

It let me scrape the first website, but when I use it on the second website it raises the following error:

line 27, in <module>
    temp = driver.execute_script('return window.Highcharts.charts[0]'

selenium.common.exceptions.JavascriptException: Message: javascript error: Cannot read property '0' of undefined

The website is: http://lumierecapital.com/#

You have to click the performance button on the left to display the chart.

Goal: I just want to scrape the Date and NAV per unit values from it.

As with the first website, this code should have printed a dict with X and Y as keys and the date and value as values, but it doesn't work for this one.

Here's the Python code:

from bs4 import BeautifulSoup
import requests
from selenium.webdriver.chrome.options import Options
from shutil import which
from selenium import webdriver
import time

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_path = which("chromedriver")
driver = webdriver.Chrome(executable_path=chrome_path, options=chrome_options)
driver.set_window_size(1366, 768)

driver.get("http://lumierecapital.com/#")

performance_button = driver.find_element_by_xpath("//a[@page='performance']")

performance_button.click()

time.sleep(7)

temp = driver.execute_script('return window.Highcharts.charts[0]'
                            '.series[0].options.data')

for item in temp:
    print(item)


1 Answer


The chart data is embedded in the source of content_performance.html, so you can skip Selenium and use the re module to extract the values of the performance chart:

import re
import requests


url = 'http://lumierecapital.com/content_performance.html'
html_data = requests.get(url).text

for year, month, day, value, datex in re.findall(r"{ x:Date\.UTC\((\d+), (\d+), (\d+)\), y:([\d.]+), datex: '(.*?)' }", html_data):
    print('{:<10} {}'.format(datex, value))

Prints:

30/9/07    576.092
31/10/07   577.737
30/11/07   567.998
31/12/07   556.670
31/1/08    460.886
29/2/08    496.740
31/3/08    484.016
30/4/08    523.829
31/5/08    546.661
30/6/08    494.067
31/7/08    475.942
31/8/08    389.147
30/9/08    299.661
31/10/08   183.690
30/11/08   190.054
31/12/08   211.960
31/1/09    193.308

... and so on.
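
If you want the result as dicts with X and Y keys, as described in the question, the same regex can feed a small parser. The following is only a sketch: the sample string is a hypothetical excerpt written to match the shape the regex above expects (the real page may differ), and parse_points and POINT_RE are names made up for this illustration. It assumes the page uses JavaScript's Date.UTC, whose months count from 0, so 1 is added to the month before building a Python date.

```python
import re
from datetime import date

# Hypothetical sample in the shape the regex expects; not copied from the real page.
sample = """
{ x:Date.UTC(2007, 8, 30), y:576.092, datex: '30/9/07' },
{ x:Date.UTC(2007, 9, 31), y:577.737, datex: '31/10/07' },
"""

# Same pattern as above, with the braces escaped explicitly.
POINT_RE = re.compile(
    r"\{ x:Date\.UTC\((\d+), (\d+), (\d+)\), y:([\d.]+), datex: '(.*?)' \}"
)


def parse_points(html_data):
    """Return a list of {'X': date, 'Y': float} dicts found in html_data."""
    points = []
    for year, month, day, value, _datex in POINT_RE.findall(html_data):
        # Date.UTC months are zero-based, Python's date months are one-based.
        x = date(int(year), int(month) + 1, int(day))
        points.append({"X": x, "Y": float(value)})
    return points


points = parse_points(sample)
for point in points:
    print(point["X"].isoformat(), point["Y"])
```

With the real page you would pass requests.get(url).text to parse_points instead of the sample string.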
Andrej Kesely