Developer of ParseKit here. Make sure you are using the head of trunk on Google Code.
- Assembler callbacks now take two arguments: the parser and the assembly.
- By default, the string `<title>` will not be tokenized as a single Symbol token. It is tokenized as one `<` Symbol token, one `title` Word token, and one `>` Symbol token. You can configure that behavior, however.
Please read the ParseKit documentation, particularly the tokenization docs, to understand how tokenization in ParseKit works.
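To see the default tokenization for yourself, you can loop over the tokens a tokenizer produces and log each one. This is only a minimal sketch: `DumpTokens` is a hypothetical helper name (not part of ParseKit), and the `<ParseKit/ParseKit.h>` import assumes you link against the framework build; adjust it to match your setup.

```objc
#import <Foundation/Foundation.h>
#import <ParseKit/ParseKit.h> // assumption: framework-style umbrella header

// Hypothetical helper (not part of ParseKit): logs every token that a
// tokenizer with the default configuration produces for the given string.
static void DumpTokens(NSString *s) {
    PKTokenizer *t = [PKTokenizer tokenizerWithString:s];
    PKToken *eof = [PKToken EOFToken];
    PKToken *tok = nil;
    // Keep pulling tokens until the tokenizer hands back the EOF token.
    while ((tok = [t nextToken]) != eof) {
        NSLog(@"(%@)", tok);
    }
}
```

Calling `DumpTokens(@"<title>")` should log three separate tokens, per the default behavior described above, rather than a single Symbol.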
I'm not sure this is the best approach for a real-world task, and reading the docs mentioned above should help explain why. That said, here's what's missing to accomplish your basic task above:
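// Register the full open and close tags as multi-character Symbol tokens.
PKTokenizer *t = [PKTokenizer tokenizerWithString:@"<title>foobar</title>"];
[t.symbolState add:@"<title>"];
[t.symbolState add:@"</title>"];
PKAssembly *a = [PKTokenAssembly assemblyWithTokenizer:t];

// Grammar: the "<title>" Symbol, then a Word, then the "</title>" Symbol.
PKSequence *p = [PKSequence sequence];
[p add:[PKSymbol symbolWithString:@"<title>"]];
PKWord *word = [PKWord word];
// Call back into self whenever the Word parser matches.
[word setAssembler:self selector:@selector(parser:didMatchWord:)];
[p add:word];
[p add:[PKSymbol symbolWithString:@"</title>"]];
PKAssembly *result = [p bestMatchFor:a];

// Assembler callback, defined as a method on the same class as the code above.
// Note the two arguments: the parser and the assembly.
- (void)parser:(PKParser *)p didMatchWord:(PKAssembly *)a {
    NSLog(@"%s %@", __PRETTY_FUNCTION__, a);
}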