-2

Possible Duplicate:
Java:XML Parser

I have an XML file, and I want to get only the text inside a specific tag (say, only the text between <HERE> and </HERE>). Each file has multiple <HERE> blocks. How can I get that?

I was using this for normal text files:

    Scanner scanner = new Scanner(file);

    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        // ..
    }

I want to be able to get only the blocks of text inside that tag, and there are several such blocks per file.

John
  • Why don't you use a Java XML parser? (See the "Related" links on the right of this page.) – Mat Sep 27 '11 at 18:41
  • Try JAXB: jaxb.java.net – TyC Sep 27 '11 at 18:42
  • My advice: don't do what you're doing right now. Instead, google for things like "JAXB", "StAX", and "Xerces", and you'll find that there are ready-made solutions for this kind of problem. – darioo Sep 27 '11 at 18:43
  • Why not google for the subject of this question and start from there? Then you might have some specific, answerable questions about a specific problem. –  Sep 27 '11 at 18:46
  • Hm, I'm sorry if I was not clear enough. It is mandatory that I write my own function for this; I cannot use existing functions. So basically what I'm learning is how it is done, not how to do it. – John Sep 27 '11 at 18:52
  • You can still google for those solutions and read their source code to learn how to do it. I think the point of the comments here is that this exact functionality has already been built 1,000 times, so a search should turn up a project with source available, which is the answer to your question. – jefflunt Sep 27 '11 at 18:54

2 Answers

1

I could type a long response about XML parsing in Java, but I can't beat this DZone article, which is one of the best quick reads on the subject:

http://refcardz.dzone.com/refcardz/using-xml-java

It explains all you need to know in just a few pages. Definitely worth a read.

Michael Berry
0

While there are better answers, you won't appreciate them without the fundamentals.

Learn SAX parsing. Basically, the parser calls your handler class whenever it enters or exits a tag. You just need to keep track of the depth (or where you are in the document), check the tag names, and capture the text you want in a StringBuilder buffer. Once the parser has finished, you call toString() on the buffer and get your combined text.
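As a rough illustration of that approach, here is a minimal sketch using the SAX parser that ships with the JDK. The class name HereTextExtractor and the extract helper are made up for this example, and the tag name HERE comes from the question. Unlike the single combined buffer described above, this version stores each block separately, which is an assumption about what the asker wants.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <HERE> element.
class HereTextExtractor extends DefaultHandler {
    private final List<String> blocks = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private int depth = 0; // greater than 0 while inside a <HERE> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("HERE".equals(qName)) {
            if (depth == 0) {
                buffer.setLength(0); // start a fresh block
            }
            depth++;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks, so append rather than assign.
        if (depth > 0) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("HERE".equals(qName)) {
            depth--;
            if (depth == 0) {
                blocks.add(buffer.toString()); // outermost <HERE> closed
            }
        }
    }

    // Parses the given XML string and returns the text of each <HERE> block.
    static List<String> extract(String xml) {
        try {
            HereTextExtractor handler = new HereTextExtractor();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.blocks;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><HERE>first</HERE><skip>ignored</skip><HERE>second</HERE></doc>";
        System.out.println(extract(xml)); // prints [first, second]
    }
}
```

To read from a file instead of a string, the same handler can be passed to the parse overload that takes a java.io.File.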

Later on, learn DOM parsing, and then XPath. However, if you don't learn how to parse XML using an XML parser, you will burn through way too much time and brainpower attempting to solve the problem badly. Building a parser from scratch isn't impossible, but it takes time away from solving the problem at hand (and odds are you don't know enough about XML yet to parse it correctly).
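For comparison, the DOM and XPath route might look something like the sketch below. It loads the whole document into memory with the JDK's DocumentBuilder and then selects every HERE element with the expression //HERE. The class name HereXPathExample and the extract helper are hypothetical, and note that a HERE nested inside another HERE would be matched twice by this expression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: DOM parsing followed by an XPath query.
class HereXPathExample {

    // Returns the text content of every <HERE> element in the document.
    static List<String> extract(String xml) {
        try {
            // Build an in-memory DOM tree of the whole document.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Select all HERE elements anywhere in the tree.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate("//HERE", doc, XPathConstants.NODESET);

            List<String> blocks = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                blocks.add(nodes.item(i).getTextContent());
            }
            return blocks;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><HERE>first</HERE><skip>ignored</skip><HERE>second</HERE></doc>";
        System.out.println(extract(xml));
    }
}
```

This is shorter than the SAX version, at the cost of holding the entire document in memory, which is one reason SAX is worth learning first for large files.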

Edwin Buck