
I'm making a Java web application that finds the text between tags in an .html file.

Example:

<title>Example</title>

The web application would open the .html file and find the text between the <title> tags, which here would be "Example".

Anyone have any ideas?

Jonathan Leffler
islandcraz
  • There are a number of HTML parsers available. Check them out [here](http://java-source.net/open-source/html-parsers). Of course, there is always regex if that does not meet your needs. – CoolBeans Jul 12 '11 at 18:37
  • Similar Q : http://stackoverflow.com/questions/5625888/how-to-get-text-other-tags-between-specific-tags-using-jericho-html-parser – Saurabh Gokhale Jul 12 '11 at 18:39
  • @CoolBeans: Ah, yes. Because everyone knows you should [parse HTML with regex.](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Robert Harvey Jul 12 '11 at 18:58
  • @Close voters: You're going to have to do better than that. "Not a Real Question" is not a synonym for "I don't like the question." – Robert Harvey Jul 12 '11 at 18:59
  • Regexps ONLY work for simple cases. – Thorbjørn Ravn Andersen Jul 12 '11 at 18:59
  • @Robert Harvey: As I mentioned, the first option is to use the HTML parsers. Regex, as I said, is an option if someone really did not want to use one of the open source parser tools. – CoolBeans Jul 12 '11 at 19:26
  • @Robert Harvey: That's a pretty highly voted answer. Good to read. Thanks for sharing :) – CoolBeans Jul 12 '11 at 19:28

1 Answer


You won't find a detailed, step-by-step answer for this, only high-level ideas. Here is mine: write a servlet that takes a URL as a parameter, then use the jsoup library to fetch that URL and parse the title tag. This is very easy with a good parser like jsoup.

String title = Jsoup.connect("http://en.wikipedia.org/").get().select("title").text();
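
If pulling in jsoup is not an option, the same idea can be roughed out with the HTML parser that ships with the JDK (`javax.swing.text.html.parser.ParserDelegator`). This is only a sketch of a swapped-in alternative, not the jsoup approach itself: the class name `TitleExtractor` and the method `extractTitle` are made up for illustration, and the bundled parser is older and likely less forgiving of messy real-world pages than jsoup.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named for illustration only.
public class TitleExtractor {

    /** Returns the text found inside the <title> element, or "" if there is none. */
    public static String extractTitle(Reader html) {
        StringBuilder title = new StringBuilder();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private boolean inTitle = false;

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TITLE) {
                    inTitle = true;
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.TITLE) {
                    inTitle = false;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Collect text only while we are between <title> and </title>.
                if (inTitle) {
                    title.append(data);
                }
            }
        };

        try {
            // The last argument (true) tells the parser to ignore charset declarations.
            new ParserDelegator().parse(html, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return title.toString().trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Example</title></head>"
                + "<body><p>Hello</p></body></html>";
        System.out.println(extractTitle(new StringReader(html)));
    }
}
```

To read from an actual .html file, as in the question, you could pass something like `Files.newBufferedReader(Paths.get("page.html"))` instead of the `StringReader`; the file name there is a placeholder, and that call throws `IOException`, which the caller has to handle.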
Amir Raminfar
  • Please consider an [RFC 2606](http://tools.ietf.org/html/rfc2606) compliant example, such as [http://www.example.com](http://www.example.com). – trashgod Jul 12 '11 at 19:04
  • @trashgod, not sure what you mean. There is a link in the post that links to jsoup. Is that what you mean? – Amir Raminfar Jul 12 '11 at 20:40