
What I'm trying to accomplish is to read the text (HTML) from a website whose URL I have entered and stored in str1. I have been able to open the website and print all of the HTML code. What I want to do, though, is print only the text between <title> and </title>, so that I can print the title of the page.

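    // Requires java.net.URL, java.io.BufferedReader and java.io.InputStreamReader.
    URL oracle = new URL(str1);
    BufferedReader in = new BufferedReader(
            new InputStreamReader(oracle.openStream()));

    // Print every line of the page source.
    String inputLine;
    while ((inputLine = in.readLine()) != null)
        System.out.println(inputLine);
    in.close();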
PM 77-1
user3250337
  • You need an HTML parser. – SLaks Oct 12 '14 at 17:33
  • @SLaks - Since OP seemingly needs to parse just `<title>`, which is used no more than once per page and always inside `<head>`, it can be done with rather simple, crude logic. – PM 77-1 Oct 12 '14 at 17:35
  • The question title leaves this question open-ended and `generalized`: stating *specific items such as* implies that `<title>` is an example, not the only thing. If that is the case, the only correct answer is **use an HTML parser**. –  Oct 12 '14 at 17:42

1 Answer


You could use a StringBuilder to read all of the lines and append them, so that you have the website's source in one string. Then you can easily search for <title> and </title> in this string.

Look at String functions like indexOf or split to get exactly what's between those tags.

I'd recommend reading this. http://docs.oracle.com/javase/6/docs/api/java/lang/String.html
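
A minimal sketch of that approach might look like the following. The class name `TitleExtractor` and the helpers `readPage` and `extractTitle` are made up for illustration. It assumes the page contains a literal lowercase `<title>` tag with no attributes and reads the stream with the platform default charset, as the question's code does. For anything beyond the title, an HTML parser, as suggested in the comments, is the safer choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

class TitleExtractor {

    // Reads the whole page into one string, as suggested above.
    static String readPage(String address) throws IOException {
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new URL(address).openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // Returns the trimmed text between <title> and </title>,
    // or null if either tag is missing.
    // Assumption: the tags appear exactly as "<title>" and "</title>".
    static String extractTitle(String html) {
        String openTag = "<title>";
        int start = html.indexOf(openTag);
        if (start < 0) {
            return null;
        }
        start += openTag.length();
        int end = html.indexOf("</title>", start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end).trim();
    }

    public static void main(String[] args) throws IOException {
        // With no argument, a small built-in sample is used instead of a real page.
        String html = args.length > 0
                ? readPage(args[0])
                : "<html><head><title> Example page </title></head></html>";
        System.out.println(extractTitle(html));
    }
}
```

Passing the URL stored in str1 as the first argument prints that page's title. Run without arguments, it prints the title of the built-in sample.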

PM 77-1
René Jahn