
How do I parse HTML text into plain text without using an attributed string?

This is my code:

- (NSString *)convertHTML:(NSString *)html {
    NSString *text = nil;
    NSScanner *myScanner = [NSScanner scannerWithString:html];
    while ([myScanner isAtEnd] == NO) {
        // Skip ahead to the next "<", then capture the tag up to (not including) ">"
        [myScanner scanUpToString:@"<" intoString:NULL];
        if ([myScanner scanUpToString:@">" intoString:&text]) {
            // Remove every occurrence of the captured tag, including its closing ">"
            html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
        }
    }
    // Trim leading and trailing spaces and tabs
    html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    return html;
}
Neil Masson
Sagar Daundkar
  • Possible duplicate of [how to remove HTML Tags from NSString in iphone?](http://stackoverflow.com/questions/23757655/how-to-remove-html-tags-from-nsstring-in-iphone) – Larme Jan 08 '16 at 12:40

3 Answers


Assuming you have access to a UIWebView of some kind, you could execute some JavaScript to retrieve the .text() of an element that contains the HTML you want to stringify.
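
A minimal sketch of that idea, assuming an off-screen UIWebView that loads the HTML and is then asked for the text of its body. `HTMLTextExtractor`, `extractTextFromHTML:completion:` and the completion block are hypothetical names made up for this illustration, and it reads the DOM's `document.body.textContent` rather than jQuery's `.text()`, so no script library has to be loaded. Only `loadHTMLString:baseURL:`, `webViewDidFinishLoad:` and `stringByEvaluatingJavaScriptFromString:` are actual UIWebView API.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: loads HTML into an off-screen UIWebView and
// reports the plain text of the rendered document.
@interface HTMLTextExtractor : NSObject <UIWebViewDelegate>
@property (nonatomic, strong) UIWebView *webView;
@property (nonatomic, copy) void (^completion)(NSString *plainText);
- (void)extractTextFromHTML:(NSString *)html
                 completion:(void (^)(NSString *plainText))completion;
@end

@implementation HTMLTextExtractor

- (void)extractTextFromHTML:(NSString *)html
                 completion:(void (^)(NSString *plainText))completion {
    self.completion = completion;
    // The web view is never added to a view hierarchy; it only parses the HTML.
    self.webView = [[UIWebView alloc] initWithFrame:CGRectZero];
    self.webView.delegate = self;
    [self.webView loadHTMLString:html baseURL:nil];
}

- (void)webViewDidFinishLoad:(UIWebView *)webView {
    // Ask the loaded document for the text content of its body.
    NSString *text =
        [webView stringByEvaluatingJavaScriptFromString:@"document.body.textContent"];
    if (self.completion) {
        self.completion(text);
        // Clear the block so a repeated delegate callback does not report twice.
        self.completion = nil;
    }
}

@end
```

The result arrives asynchronously, so the caller has to keep a strong reference to the extractor until the block runs, and the call should be made on the main thread. UIWebView has since been deprecated by Apple, so treat this as an illustration of the approach and not as a recommendation.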

Matt Fellows

You can use the code below (taken from here):

-(NSString *)stringByStrippingHTML:(NSString*)str
{
    NSRange r;
    while ((r = [str rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
    {
        str = [str stringByReplacingCharactersInRange:r withString:@""];
    }
    return str;
}

NSString *htmlString = @"This is <font color='red'>simple</font>";

NSString *strWithoutFormatting = [self stringByStrippingHTML:htmlString];

NSLog(@"%@", strWithoutFormatting);

Maybe this will help you :)

Community
Mohanraj S K

If using a library is an option you could try HTMLKit.

For example, given the following HTML:

<p>Some <b>text</b> to <em>extract</em></p>

one way to parse it to plain text would be:

// create a <div> element
HTMLElement *element = [[HTMLElement alloc] initWithTagName:@"div"];
// set its innerHTML
element.innerHTML = @"<p>Some <b>text</b> to <em>extract</em></p>";
// textContent of the element contains all the text
NSLog(@"%@", element.textContent);
// You get: 'Some text to extract'

Let me know if you need further help.

If your HTML is simple and parsing it is not the core functionality of your app/project, then maybe HTMLKit is not for you, since it is a full-fledged HTML parser.

iska