Possible Duplicate:
How to create a Java String from the contents of a file

I'm stuck on the problem of loading a whole file (an .html file) into a single String.

I'm trying to print the contents between <body> and </body>. However, when I run my code, it does not write anything to the file. I believe the problem is that there are no <body> or </body> tags in the first line, so the indexOf() method returns -1 and the extraction fails. Someone told me I should load the whole .html file, which contains a lot of lines, into a single String; I believe he means loading it as one line. I do not know how to do it...

Here is my code:

// f (input file), o (output file) and c (Scanner) are declared elsewhere
PrintWriter pr;
try {
    c = new Scanner(f);
    pr = new PrintWriter(new FileOutputStream(o));
    while (c.hasNextLine()) {
        // each iteration sees only one line of the file
        String text = c.nextLine();
        String index = "<body>";
        String index2 = "</body>";
        int i1 = text.indexOf(index);
        int i2 = text.indexOf(index2);
        text = text.substring(i1 + 6, i2);
        System.out.println("here it is");
        System.out.println("you did !!!");
    }
} catch (Exception e) {}
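
To illustrate the suspicion above, here is a minimal sketch of what happens when the tags are searched for one line at a time. The class name `PerLineIndexOfDemo`, the helper `bodyOf` and the sample lines are made up for this example. When a line has no tags, indexOf() returns -1 for both, and substring(5, -1) throws a StringIndexOutOfBoundsException, which an empty catch block would silently swallow.

```java
class PerLineIndexOfDemo {
    // Returns the text between <body> and </body> within a single line,
    // or null if either tag is missing from that line.
    static String bodyOf(String line) {
        int start = line.indexOf("<body>");
        int end = line.indexOf("</body>");
        if (start == -1 || end == -1 || end < start) {
            return null;
        }
        return line.substring(start + "<body>".length(), end);
    }

    public static void main(String[] args) {
        // A typical page has the tags on separate lines, so no single
        // line contains both of them.
        String[] lines = { "<html>", "<body>", "Hello", "</body>", "</html>" };
        for (String line : lines) {
            System.out.println(line + " -> " + bodyOf(line));
        }

        // Without the -1 check, the substring call itself fails.
        try {
            "<html>".substring(-1 + 6, -1);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring failed: " + e.getMessage());
        }
    }
}
```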



1 Answer


Ok. Here is a small sample to help you start.

This will work with a simple HTML page.
But for complex pages (with embedded CSS etc.) you will have to figure out how to find the start/end of the body.

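FileInputStream fin = new FileInputStream("theFile.html");
// available() is only an estimate of the file size; fine for a small local file
byte[] data = new byte[fin.available()];
fin.read(data);   // actually read the bytes; without this, data stays all zeros
fin.close();
String htmlFile = new String(data);
int start = htmlFile.indexOf("<body>");
if (start != -1) {
  int end = htmlFile.indexOf("</body>");
  if (end != -1) {
     // 6 is the length of "<body>"
     System.out.println("Body is: " + htmlFile.substring(start + 6, end));
  }
}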
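
As a possible variation on the sample above, here is a sketch that reads the whole file with `Files.readAllBytes` and also tolerates attributes on the opening tag (for example `<body class="main">`). The class name `BodyExtractor`, the helpers `readWholeFile` and `extractBody`, the temporary sample file and the UTF-8 encoding are all assumptions made for this illustration. It is still plain string searching: it is case-sensitive and would be confused by a `>` inside an attribute value, so a real HTML parser is the safer choice for arbitrary pages.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BodyExtractor {
    // Loads the entire file into one String (UTF-8 is assumed here).
    static String readWholeFile(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Returns the text between the opening <body ...> tag and </body>,
    // or null if either tag cannot be found.
    static String extractBody(String html) {
        int open = html.indexOf("<body");
        if (open == -1) {
            return null;
        }
        // Skip past any attributes to the '>' that ends the opening tag.
        int tagEnd = html.indexOf('>', open);
        if (tagEnd == -1) {
            return null;
        }
        int contentStart = tagEnd + 1;
        int close = html.indexOf("</body>", contentStart);
        if (close == -1) {
            return null;
        }
        return html.substring(contentStart, close);
    }

    public static void main(String[] args) throws IOException {
        // Write a small sample page to a temporary file so the sketch runs
        // anywhere; in real use, pass the path of your own .html file.
        Path file = Files.createTempFile("sample", ".html");
        String page = "<html>\n<body class=\"main\">\nHello\n</body>\n</html>";
        Files.write(file, page.getBytes(StandardCharsets.UTF_8));

        System.out.println("Body is: " + extractBody(readWholeFile(file)));
        Files.delete(file);
    }
}
```

Because the file is loaded as one String, it no longer matters that the tags sit on different lines.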