It's a stream parser, so as it parses it tells you what it hits. You should extend HTMLEditorKit.ParserCallback with some class (I'll call it Parser), then override the methods you care about.
I believe it only works for "the html dtd in swing" (see here). If you're doing anything more complicated, I recommend you instead use an external Java HTML parsing library, such as one of the ones I linked to before.
Here's the basic code (demo):
import javax.swing.text.html.parser.*;
import javax.swing.text.html.*;
import javax.swing.text.*;
import java.io.*;

class Parser extends HTMLEditorKit.ParserCallback
{
    // True while the parser is between <td> and </td>.
    private boolean inTD = false;

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos)
    {
        if(t.equals(HTML.Tag.TD))
        {
            inTD = true;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos)
    {
        if(t.equals(HTML.Tag.TD))
        {
            inTD = false;
        }
    }

    @Override
    public void handleText(char[] data, int pos)
    {
        // Only text inside a table cell is passed on.
        if(inTD)
        {
            doSomethingWith(data);
        }
    }

    public void doSomethingWith(char[] data)
    {
        // println(char[]) prints the characters themselves.
        System.out.println(data);
    }
}

class HtmlTester
{
    public static void main (String[] args) throws java.lang.Exception
    {
        ParserDelegator pd = new ParserDelegator();
        // Reads HTML from standard input; the third argument is ignoreCharSet.
        pd.parse(new BufferedReader(new InputStreamReader(System.in)), new Parser(), false);
    }
}
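
If you'd rather get the cell contents back as a list than print them as they stream by, the same callback approach can be adapted roughly like this. It's only a sketch: TdCollector and extractCells are names I made up for illustration (they are not part of Swing), it reads from a String instead of System.in, and it assumes every cell has an explicit closing </td>.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper (not part of Swing): collects the text of each <td> cell.
class TdCollector extends HTMLEditorKit.ParserCallback {
    private final List<String> cells = new ArrayList<>();
    // Non-null only while the parser is inside a <td>.
    private StringBuilder current = null;

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (t.equals(HTML.Tag.TD)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // A cell with inline markup can produce several text events,
        // so append instead of overwriting.
        if (current != null) {
            current.append(data);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        if (t.equals(HTML.Tag.TD) && current != null) {
            cells.add(current.toString());
            current = null;
        }
    }

    // Parses an in-memory HTML string and returns the text of every <td>.
    static List<String> extractCells(String html) {
        TdCollector collector = new TdCollector();
        try {
            // ignoreCharSet = true: the input is already decoded into a String.
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector.cells;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>alpha</td><td>beta</td></tr></table>";
        System.out.println(extractCells(html));
    }
}
```

Buffering into a StringBuilder and flushing on the end tag means a cell like <td>a <b>b</b></td> comes back as one entry instead of several separate text chunks.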