2

Given an HTML website which displays the outside temperature and other unimportant pieces of information:

<div style="">15</div>

15 is the target number, which I want to extract into a variable.

What I want is for a Java program to go to the website, search for the particular line of HTML code (temperature=15;) and, once it is found, display it like this: https://i.stack.imgur.com/lY0qi.jpg

All I want to know is what syntax I should use to make the program request that number.

Thomas Uhrig
user3342072

2 Answers

1

Extracting information from a website is called crawling or scraping.

You basically go to the website, get the HTML source and search it for your element. You can search with a regular expression or (more commonly) with a parser like Jsoup.
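As a rough illustration of the regular-expression route, here is a minimal sketch using only the Java standard library (Java 11+ for `java.net.http`). The class name `TemperatureScraper`, the helper names, and the pattern itself are my own assumptions for illustration: the pattern assumes the temperature is the only content of a `<div>`, as in the snippet from the question, and would need adjusting for the real page. A parser is usually more robust than a regex on real-world HTML.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class TemperatureScraper {
    // Assumption: the temperature is a short integer that is the sole content of a <div>.
    private static final Pattern TEMPERATURE_DIV =
            Pattern.compile("<div[^>]*>\\s*(-?\\d{1,3})\\s*</div>", Pattern.CASE_INSENSITIVE);

    // Returns the first matching number in the HTML, or empty if none is found.
    static OptionalInt extractTemperature(String html) {
        Matcher m = TEMPERATURE_DIV.matcher(html);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    // Downloads the page body as a string with a plain GET request.
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // With no URL argument, fall back to the sample markup from the question.
        String html = args.length > 0 ? fetch(args[0]) : "<div style=\"\">15</div>";
        OptionalInt temperature = extractTemperature(html);
        System.out.println(temperature.isPresent()
                ? "Temperature: " + temperature.getAsInt()
                : "No temperature found");
    }
}
```

Pass the page URL as the first command-line argument to run it against a live site. The extracted value ends up in an ordinary `int` (via `getAsInt()`), which you can then display however you like.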

You will find a lot of working examples on the official site of Jsoup (e.g. http://jsoup.org/cookbook/extracting-data/example-list-links). Jsoup will parse the HTML source into a DOM-like structure with elements and nodes. You can search for specific nodes, e.g. for all DIV elements. Then you can iterate over them and get your temperature.

Thomas Uhrig
1

There are tools called scrapers that extract information from the web. There are many Java APIs that let you write your own scraper. You can try JSoup, HTMLUnit or Jaunt.

Mifmif