What are some good open source Java libraries for searching a web page, scraping data out of it, and storing that data in a database? For example, suppose I had a page containing:
<tr><td><b>Address:</b></td>
<td colspan=3>123 My Street </td></tr>
Here "Address:" is the key, but the value I actually want is "123 My Street", which is separated from the key by several HTML tags and some whitespace. Ideally I want the contents of the td that follows the one containing the string "Address:". It looks like JSoup can find the key, but I didn't see a good example of how to step from the match to the adjacent cell (I may have missed it). Is there a library that handles key/value extraction like this?
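To make the goal concrete, here is a rough sketch of the kind of lookup I have in mind, assuming JSoup is on the classpath. The class name `KeyValueScraper` and the helper `extractPairs` are made up for illustration, and the sketch assumes every row holds the key in its first cell and the value in its second. The snippet is wrapped in a `<table>` element because an HTML parser will generally discard bare `<tr>`/`<td>` tags that appear outside a table.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.LinkedHashMap;
import java.util.Map;

public class KeyValueScraper {

    // Hypothetical helper (not part of JSoup): treats the first cell of each
    // table row as the key and the second cell as the value.
    static Map<String, String> extractPairs(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> pairs = new LinkedHashMap<>();

        for (Element row : doc.select("tr")) {
            Elements cells = row.children(); // the td/th cells of this row
            if (cells.size() < 2) {
                continue; // no key/value pair in this row
            }

            // text() drops nested tags such as <b> and normalizes whitespace.
            String key = cells.get(0).text();
            if (key.endsWith(":")) {
                key = key.substring(0, key.length() - 1).trim();
            }
            String value = cells.get(1).text();

            pairs.put(key, value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Assumption: the fragment from the page, wrapped in <table> so the
        // row and cell tags survive parsing.
        String html = "<table><tr><td><b>Address:</b></td>\n"
                + "<td colspan=3>123 My Street </td></tr></table>";

        Map<String, String> pairs = extractPairs(html);
        System.out.println(pairs.get("Address")); // prints "123 My Street"
    }
}
```

From there the resulting map could be written to a database with plain JDBC, but what I'm really hoping for is a library that already understands this key/value layout.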
I'd also be interested in learning about any open source (MIT/Apache) initiatives for UI scripting similar to the Kapow Extraction Browser.
Thanks.