-9

I'm developing an Android application that programmatically submits a query to a website's search box and retrieves the results in Java.

I fetch the page with Java's URLConnection and get back its source, i.e. the HTML code:

URLConnection a = (connect to host)

a.getInputStream()

read data
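The "read data" step above could look roughly like the sketch below. It only uses the standard library; the class name `PageReader`, the method name `readAll`, and the UTF-8 charset are assumptions for illustration, not something stated in the question. In the app, the stream would come from the connection's `getInputStream()`; here an in-memory stream stands in for it so the sketch runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class PageReader {
    // Reads the whole stream into one String.
    // Assumes the page is UTF-8 encoded (an assumption; check the real site).
    static String readAll(InputStream in) {
        // The "\\A" delimiter matches only the start of input,
        // so the first token is the entire content of the stream.
        try (Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by URLConnection.getInputStream().
        InputStream in = new ByteArrayInputStream(
                "<p>sahil</p>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in));
    }
}
```

Holding the page as a single String keeps the later extraction step independent of how the page was downloaded.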

I use these functions. Now, suppose the site has content like:

sahil
3/5 patel chowk
965955

Since these details will be inside HTML tags, I want to extract just this information. Any ideas?

Kazekage Gaara

3 Answers

2

Have you had a look at jsoup (http://jsoup.org/)? It's an HTML parser and should do what you need.

David Kroukamp
0

My guess is that a regular expression would be a good fit for you in this case. See: How to use regular expressions to parse HTML in Java?
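As a rough illustration of the regular-expression route, here is a minimal sketch using only `java.util.regex`. The markup (a table row whose cells carry `class="name"`, `class="address"` and `class="phone"`) is purely hypothetical, since the question does not show the real HTML, and the names `RegexExtract` and `firstCell` are made up for this example. A pattern like this depends on the exact markup and will stop matching if the page layout changes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExtract {
    // Returns the text of the first <td class="..."> cell with the given
    // class, or null if no such cell is found. Hypothetical markup only.
    static String firstCell(String html, String cssClass) {
        Pattern p = Pattern.compile(
                "<td\\s+class=\"" + Pattern.quote(cssClass) + "\"\\s*>(.*?)</td>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Made-up page fragment wrapping the values from the question.
        String html = "<tr><td class=\"name\">sahil</td>"
                + "<td class=\"address\">3/5 patel chowk</td>"
                + "<td class=\"phone\">965955</td></tr>";
        System.out.println(firstCell(html, "name"));
        System.out.println(firstCell(html, "address"));
        System.out.println(firstCell(html, "phone"));
    }
}
```

The reluctant group `(.*?)` stops at the first closing `</td>`, so each call returns one cell rather than everything up to the end of the row.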

christian.vogel
  • But the page is very big, and I need to extract only a very small area. How do I jump there directly? – Sahil Manchanda Jun 17 '12 at 15:25
  • With a library like the one @David already mentioned, a large page as input is not a problem. It parses the whole page and looks for the specific condition you give the parser, so don't worry about that. Just have a look at jsoup and the samples provided on its site. – christian.vogel Jun 17 '12 at 15:31
0

Use JTidy. It is an easy-to-use Java library for extracting content from HTML pages.

Sunny