0
<description>
SEBI : Decision taken by a listed investment company to dispose of a part of its
       investment is not “price sensitive information” within meaning of SEBI
      (Prohibition of Insider Trading) Regulations, 1992<br>;
      By <b>  [2011] 15 taxmann.com 229 (SAT)</b> 
</description>

This is the XML. I want to parse the data after <br>. I'm able to parse the text before <br>, but not the text after it.

This is my handler class code:

package com.exercise;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class RSSHandler extends DefaultHandler {

    final int state_unknown = 0;
    final int state_title = 1;
    final int state_description = 2;
    final int state_link = 3;
    final int state_pubdate = 4;
    int currentState = state_unknown;

    RSSFeed feed;
    RSSItem item;

    boolean itemFound = false;

    RSSHandler(){
    }

    RSSFeed getFeed(){
        return feed;
    }

    @Override
    public void startDocument() throws SAXException {
        // TODO Auto-generated method stub
        feed = new RSSFeed();
        item = new RSSItem();

    }



    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // TODO Auto-generated method stub

        if (localName.equalsIgnoreCase("item")){
            itemFound = true;
            item = new RSSItem();
            currentState = state_unknown;
        }
        else if (localName.equalsIgnoreCase("title")){
            currentState = state_title;
        }
        else if (localName.equalsIgnoreCase("description")){
            currentState = state_description;
        }
        else if (localName.equalsIgnoreCase("link")){
            currentState = state_link;
        }
        else if (localName.equalsIgnoreCase("pubdate")){
            currentState = state_pubdate;
        }
        else{
            currentState = state_unknown;
        }

    }


    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // TODO Auto-generated method stub
        currentState = state_unknown;
        if (localName.equalsIgnoreCase("item")){
            feed.addItem(item);
        }


    }

    @Override
    public void characters(char ch[], int start, int length)
            throws SAXException {
        //super.characters(ch, start, length);
        // TODO Auto-generated method stub
        StringBuilder buf=new StringBuilder();

        if (buf!=null) {
            for (int i=start; i<start+length; i++) {
                buf.append(ch[i]);
            }

            String strCharacters=buf.toString();

            if (itemFound==true){
                // "item" tag found, it's item's parameter
                switch(currentState){
                case state_title:
                    item.setTitle(strCharacters);
                    break;
                case state_description:
                    item.setDescription(strCharacters);  //here data coming
                    break;
                case state_link:
                    item.setLink(strCharacters);
                    break;
                case state_pubdate:
                    item.setPubdate(strCharacters);
                    break;
                default:
                    break;
                }
            }
            else{
                // not "item" tag found, it's feed's parameter
                switch(currentState){
                case state_title:
                    feed.setTitle(strCharacters);
                    break;
                case state_description:
                    feed.setDescription(strCharacters);
                    break;
                case state_link:
                    feed.setLink(strCharacters);
                    break;
                case state_pubdate:
                    feed.setPubdate(strCharacters);
                    break;
                default:
                    break;
                }
            }

            currentState = state_unknown;
        }
    }

}
Sathyajith Bhat
xyz Sad

4 Answers

0

&amp; is an XML entity reference and means &.

By default, SAX does the conversion for you, so if your source XML says hello&amp;goodbye, you should see hello&goodbye. Go through this link; it might solve your problem.

Community
Shaireen
0

Something is wrong with the first text you pasted. Try posting the XML again in code mode (4 spaces at the beginning of each line).

My suspicion is that your XML is in URL-encoded format and that you'll have to decode it before you start handling it.

Nir Alfasi
0

As posted, that XML is not valid. You will probably need to escape the quotes in the document as well.

I don't know if that is your issue, but it will be a contributor.

(the quotes are around "price sensitive information")

Tim Jarvis
0

I think the problem in your case is that you are initializing the StringBuilder inside characters(), so a new object is created on every call. SAX may deliver an element's text in several characters() calls, so the buffer has to outlive a single call. Instead of initializing it in characters(), try initializing it in startElement():

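// Keep the buffer in a field so that characters() can append to it.
private StringBuilder buf;

@Override
public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    buf = new StringBuilder(); // fresh buffer for each element
    ..........
}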
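
A minimal, self-contained sketch of one way to apply this suggestion: append every chunk in characters() and store the text only once endElement() fires. It assumes the <br> in the real feed is escaped as &lt;br&gt; (a raw, unclosed <br> would make the XML malformed). The names DescriptionTextDemo, Handler and parseDescription are hypothetical and not part of the original RSSHandler. It matches on qName because the default JAXP parser is not namespace-aware; on Android, localName may be the right field to check.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; it only illustrates the buffering pattern.
class DescriptionTextDemo {

    // Collects the complete text of the first <description> element.
    static class Handler extends DefaultHandler {
        private final StringBuilder buf = new StringBuilder();
        private boolean inDescription = false;
        String description;

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) {
            if (qName.equalsIgnoreCase("description")) {
                inDescription = true;
                buf.setLength(0); // start a fresh buffer for this element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element: append, never overwrite.
            if (inDescription) {
                buf.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equalsIgnoreCase("description")) {
                // The element is complete, so the buffer now holds all of its text.
                if (description == null) {
                    description = buf.toString();
                }
                inDescription = false;
            }
        }
    }

    // Parses the given XML string and returns the first description's text.
    static String parseDescription(String xml) {
        try {
            Handler handler = new Handler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.description;
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }
}
```

Calling DescriptionTextDemo.parseDescription("<item><description>before&lt;br&gt;after</description></item>") should return the text on both sides of the break, with the escaped tag decoded to <br>. The same idea carries over to the original handler: call item.setDescription(...) from endElement() rather than from characters(), and do not reset currentState inside characters().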
Lalit Poptani