
I need to display HTML text inside my iOS app, and I decided to use the built-in method on NSAttributedString, initWithData:options:documentAttributes:error:. The parsing itself works excellently. However, I seem to have come across a very odd bug that only manifests itself when the debugger is attached.

The first time this method is called, it takes just under 1 second to run on my iPhone 5S running iOS 7.0.4, and about 1.5 seconds on an iPod Touch 5th generation. The quirk also shows up in the simulator, but it is significantly less noticeable there because the simulator is so much faster.

Subsequent calls only take around 10-50ms, which is significantly faster than the initial call.

This doesn't appear to be related to caching of the input string, as I have tested it with multiple input strings in my 'real' application.

However, when I run the program without the debugger, parsing takes about 10-20ms, which is what I expect HTML parsing to take.

Here is the relevant section of code:

-(void) benchmarkMe:(id)sender {
    NSData *data = [testString dataUsingEncoding:NSUTF8StringEncoding];

    NSTimeInterval startTime = [[NSDate date] timeIntervalSinceReferenceDate];

    // So the compiler doesn't keep complaining at me.
    __attribute__((unused))
    NSAttributedString *parsed = [[NSAttributedString alloc] initWithData:data
                                                                  options:@{
                                                                        NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                                                        NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)
                                                                    }
                                                       documentAttributes:nil
                                                                    error:nil];

    NSTimeInterval endTime = [[NSDate date] timeIntervalSinceReferenceDate];

    NSString *message = [NSString stringWithFormat:@"Took %lf seconds.", endTime - startTime];

    UIAlertView *alertView = [[UIAlertView alloc] initWithTitle:@"Benchmark complete!"
                                                        message:message
                                                       delegate:nil
                                              cancelButtonTitle:@"Ok"
                                              otherButtonTitles:nil];
    [alertView show];
}

Note: A fully working project demonstrating this bug is available here:
https://github.com/richardjrossiii/NSAttributedStringHTMLBug

Am I crazy? Is there something I'm missing here? 1 second is an awfully large amount of time when I'm trying to optimize my app for performance.

My current solution is to parse a 'dummy' string on application launch, but this seems like an incredibly hacky workaround.
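
The 'dummy' string workaround described above could look roughly like the sketch below. The helper name WarmUpHTMLImporter and the placeholder HTML are hypothetical and not taken from the linked project. Apple's documentation advises against calling the HTML importer from a background thread, so this sketch stays on the main queue and only defers the work until after launch has returned.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper (not part of the original project): parses a throwaway
// HTML string once so that later calls do not pay the one-time setup cost.
static void WarmUpHTMLImporter(void) {
    // Deferred to a later pass of the main run loop so that
    // application:didFinishLaunchingWithOptions: can return first.
    dispatch_async(dispatch_get_main_queue(), ^{
        NSData *data = [@"<p>warm-up</p>" dataUsingEncoding:NSUTF8StringEncoding];
        NSDictionary *options = @{
            NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
            NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)
        };

        NSError *error = nil;
        NSAttributedString *discarded =
            [[NSAttributedString alloc] initWithData:data
                                             options:options
                                  documentAttributes:nil
                                               error:&error];
        if (discarded == nil) {
            NSLog(@"HTML warm-up failed: %@", error);
        }
    });
}
```

Calling WarmUpHTMLImporter() from the app delegate's launch method would move the slow first parse away from the first user-visible use. It still blocks the main thread for the duration of that first parse, so this only relocates the cost.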

Johannes Fahrenkrug
Richard J. Ross III

1 Answer


That's a really good question. It turns out that (at least for me) the first call to the method is always slower, whether or not the debugger is attached. Here is why: the first time you parse an HTML attributed string, iOS loads the whole JavaScriptCore engine and WebKit into memory. Watch:

The first time we run the method (before parsing the string), only 3 threads exist:

screenshot 1

After the string is parsed, we have 11 threads:

screenshot 2

Now the next time we run the method, most of those web-related threads are still in existence:

screenshot 3

That explains why it's slow the first time and fast thereafter.
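
The same observation can be reproduced without the debugger's thread view. The following is a minimal sketch, assuming it is called on the main thread; CurrentThreadCount and LogHTMLParseCost are hypothetical helper names, and the exact thread counts will vary by device and iOS version.

```objc
#import <UIKit/UIKit.h>
#import <mach/mach.h>

// Hypothetical helper: number of threads currently in this process.
static NSUInteger CurrentThreadCount(void) {
    thread_act_array_t threads = NULL;
    mach_msg_type_number_t count = 0;
    if (task_threads(mach_task_self(), &threads, &count) != KERN_SUCCESS) {
        return 0;
    }
    // task_threads hands back port rights and a kernel-allocated array;
    // release both so the measurement itself does not leak.
    for (mach_msg_type_number_t i = 0; i < count; i++) {
        mach_port_deallocate(mach_task_self(), threads[i]);
    }
    vm_deallocate(mach_task_self(), (vm_address_t)threads,
                  count * sizeof(thread_act_t));
    return count;
}

// Hypothetical helper: parses `html` once and logs the elapsed time
// together with the thread count before and after the call.
static void LogHTMLParseCost(NSString *html) {
    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    NSDictionary *options = @{
        NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
        NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)
    };

    NSUInteger threadsBefore = CurrentThreadCount();
    NSTimeInterval start = [NSDate timeIntervalSinceReferenceDate];

    NSAttributedString *parsed =
        [[NSAttributedString alloc] initWithData:data
                                         options:options
                              documentAttributes:nil
                                           error:NULL];

    NSTimeInterval elapsed = [NSDate timeIntervalSinceReferenceDate] - start;
    NSLog(@"Parsed %lu characters in %.3f s; threads: %lu -> %lu",
          (unsigned long)parsed.length, elapsed,
          (unsigned long)threadsBefore, (unsigned long)CurrentThreadCount());
}
```

Calling LogHTMLParseCost twice in a row with different input strings should show the pattern described above: a slow first call that adds threads, then a fast second call with little or no change in the thread count.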

Iulian Onofrei
Johannes Fahrenkrug
  • Why in the world would it use JavaScriptCore? Seems like a terrible idea... Anyhow, after doing some instruments profiling of my own, it agrees with your discoveries. – Richard J. Ross III Jan 16 '14 at 17:06
  • @RichardJ.RossIII I have no idea why it is loading JSC. I understand that it is loading WebKit, because parsing an HTML string is expensive: a DOM tree has to be built, parsed and (possibly even) CSS styles applied. So even if just `bold` is being parsed, a whole HTML parsing machinery has to be fired up. – Johannes Fahrenkrug Jan 16 '14 at 18:59
  • Incredibly interesting observations here guys. Nice examination – mattsven Mar 22 '15 at 17:52