I'm trying to crawl a webpage in Java, and I need to search the page for URLs and file paths, which may be relative or absolute (e.g. ../../file.gif or http://hostname.com/file.gif). Not all of these will have HTML tags like <a href> around them, since some of the file paths may be embedded in JavaScript.
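To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, assuming a plain regex scan over the raw page text using java.util.regex. The class name LinkExtractor, the method extractLinks, and the pattern itself are placeholders I made up for illustration. The pattern only catches http/https URLs and paths that start with ./ or ../ and end in a file extension, so it would miss other forms (root-relative paths, protocol-relative URLs, paths without an extension).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans raw page text (HTML and inline JavaScript alike)
// for absolute http(s) URLs and dot-relative file paths.
class LinkExtractor {

    // Assumption: a link is either
    //   - an http:// or https:// URL running up to whitespace, a quote, <, > or ), or
    //   - one or more "./" or "../" segments followed by a path ending in ".ext".
    private static final Pattern LINK = Pattern.compile(
            "https?://[^\\s\"'<>)]+|(?:\\.{1,2}/)+[\\w./-]+\\.\\w+");

    static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(text);
        while (m.find()) {
            links.add(m.group()); // keep matches in document order, duplicates included
        }
        return links;
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://hostname.com/file.gif\">x</a> "
                + "<script>var img = '../../file.gif';</script>";
        for (String link : extractLinks(page)) {
            System.out.println(link);
        }
    }
}
```

This treats the page as plain text, so it finds paths inside script blocks as well as in tag attributes, but it knows nothing about HTML structure and will probably produce false positives and misses on real pages.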
If anyone can point me in the right direction that would be fantastic.