I want to scrape data from a graph on a particular website. The information in the graph is only shown when you hover the mouse over it. After scraping, however, the data is missing from my output, even though it is visible under 'Inspect Element'.

I tried scraping with Jsoup, but the data that changes as the mouse hovers over the graph is not in the result. How can I get it?

Below is the information I have to scrape: the dynamically changing value '184'.

The value 184 changes dynamically as you hover the mouse over the graph, and so do the RGB values displayed on the line above it.

After scraping, the number 184 and the RGB values do not appear in the document returned by Jsoup. Why are these fields missing from the output? Is it because the data is generated dynamically on mouse hover?

What I actually need to scrape is the 'Carbon Intensity' value from the graph "Carbon Intensity in the last 24 hours", which is displayed only while the mouse hovers over the graph.

I have been stuck on this problem for two days and have not found a helpful solution. I am using Jsoup on Linux. Could someone suggest how I can do this? Thanks in advance!

Priya
  • Is the data really only loaded when you hover over it? Or is it just hidden until you hover over it? In either case you should have a look at Selenium for this – dustytrash Sep 21 '18 at 12:39
  • It is just hidden until I hover over it. It is displayed only once I hover over the time graph. – Priya Sep 21 '18 at 12:49
  • It should be very easy to get the data then. I'm not familiar with Jsoup, but in Selenium you could load the page once and get the data by its HTML tag. I'd also suggest checking whether this website/app has an API – dustytrash Sep 21 '18 at 12:53
  • Do you have any idea why the data that is supposed to be in the output is not shown in fig 2? I mean the value 184, which is present in the first image, is not displayed in the output image, i.e. in figure 2 (after scraping). – Priya Sep 21 '18 at 12:57
  • Probably because there's some JavaScript function that puts the data there on mouse hover. – dustytrash Sep 21 '18 at 12:58
  • I understand. Any suggestion on how I can get that data? (Even in Selenium is OK) – Priya Sep 21 '18 at 13:07
  • Can we have the URL of this page? – Krystian G Sep 21 '18 at 22:42
  • Possible duplicate of [Page content is loaded with javascript and Jsoup doesn't see it](https://stackoverflow.com/questions/7488872/page-content-is-loaded-with-javascript-and-jsoup-doesnt-see-it) – luksch Sep 22 '18 at 11:54
  • @KrystianG : https://www.electricitymap.org/?page=country&solar=false&remote=true&wind=false&countryCode=DE – Priya Sep 24 '18 at 07:27

1 Answer

To do that you should use Selenium. Add it as a dependency through Maven, or through whatever dependency manager you are using. Then download the geckodriver binary for your platform (https://github.com/mozilla/geckodriver/releases) and place it in your project folder so that Selenium can drive Firefox. You can also use Google Chrome by following this tutorial (https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver).
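Selenium looks for geckodriver through the `webdriver.gecko.driver` system property (or on the `PATH`), so it can help to check the binary before starting the browser. The following is only a minimal sketch using the JDK alone: the class name `GeckoDriverSetup`, the method `registerDriver`, and the `drivers/geckodriver` location are hypothetical and should be adapted to your project.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class GeckoDriverSetup {

    // System property that Selenium's Firefox support reads to locate geckodriver.
    static final String GECKO_PROPERTY = "webdriver.gecko.driver";

    /**
     * Sets the system property if the given file exists and is executable.
     * Returns true when the property was set, false otherwise.
     */
    public static boolean registerDriver(Path driverBinary) {
        if (!Files.isRegularFile(driverBinary) || !Files.isExecutable(driverBinary)) {
            return false;
        }
        System.setProperty(GECKO_PROPERTY, driverBinary.toAbsolutePath().toString());
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical location: adjust to wherever you unpacked geckodriver.
        Path driver = Paths.get("drivers", "geckodriver");
        if (registerDriver(driver)) {
            System.out.println("Using driver at " + System.getProperty(GECKO_PROPERTY));
        } else {
            System.out.println("No executable geckodriver found at " + driver.toAbsolutePath());
        }
    }
}
```

Calling `GeckoDriverSetup.registerDriver(...)` before `new FirefoxDriver()` turns a missing or non-executable binary into a clear message instead of a startup error from Selenium.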

There are many tutorials on how to make a page's JavaScript run so that you can read the content it generates. To move the mouse over an element of the HTML, it could look something like this:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

WebDriver webDriver = new FirefoxDriver();
webDriver.get(URL); // place the URL you are crawling here

// By offers many other ways to select HTML content; change the locator if you
// need to hover over a different element
WebElement webElement = webDriver.findElement(By.id("country-emission-rect"));

// move the mouse over the element to trigger the hover handler
Actions action = new Actions(webDriver);
action.moveToElement(webElement).perform();

// wait at most 15 seconds (Selenium 4 expects Duration.ofSeconds(15) instead of 15)
WebDriverWait webDriverWait = new WebDriverWait(webDriver, 15);

// wait until the element with class name "country-emission-intensity" is visible
webDriverWait.until(ExpectedConditions.visibilityOfElementLocated(By.className("country-emission-intensity")));

// get the HTML generated after the mouse-over, which now contains the text you want
String fullHtml = webDriver.getPageSource();
webDriver.quit();

If you want to keep using Jsoup instead of Selenium for the scraping, you can now do:

Document document = Jsoup.parse(fullHtml);
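Once the document is parsed, the element text can be read with something like `document.select(".country-emission-intensity").text()` and then converted to a number. The sketch below covers only that last conversion step and uses the JDK alone. It assumes the element text is the number, possibly surrounded by whitespace or a unit, which I have not verified against the page. The class name `IntensityText` and the method `parseIntensity` are hypothetical.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntensityText {

    // Matches the first run of digits in the text, e.g. "184" in " 184 g ".
    private static final Pattern FIRST_INTEGER = Pattern.compile("\\d+");

    /**
     * Returns the first integer found in the element text, or an empty
     * OptionalInt if the text is null, has no digits, or overflows an int.
     */
    public static OptionalInt parseIntensity(String elementText) {
        if (elementText == null) {
            return OptionalInt.empty();
        }
        Matcher matcher = FIRST_INTEGER.matcher(elementText);
        if (!matcher.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(matcher.group()));
        } catch (NumberFormatException e) {
            // More digits than fit in an int: treat as "no usable value".
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // Sample strings standing in for the scraped element text.
        System.out.println(parseIntensity("184"));
        System.out.println(parseIntensity(""));
    }
}
```

Returning an `OptionalInt` rather than throwing makes the "hover did not populate the element" case easy to detect, since the element text is then empty.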

Remember to place the geckodriver binary in your project folder and to install all the Selenium dependencies correctly (enabling auto-import if you are using Maven).

Hope it helped you! If you need anything else feel free to ask!

alvarobartt