
My goal is to parse a large XML file (20 GB) with Swift. There are some performance issues with NSXMLParser and bridging to Swift objects, so I'm looking at multi-threading. Specifically the following division:

  1. Main thread - parses data
  2. Worker thread - casts ObjC types into Swift types and sends to 1. The casting of ObjC NSDictionary to [String: String] is the largest bottleneck. This is also the main reason for separating onto multiple threads.
  3. Worker thread - parses XML into ObjC types - and sends to 2. NSXMLParser is a push parser: once it starts parsing, you cannot pause it.

The data should be parsed sequentially, so the input ordering must be maintained. My idea is to run an NSRunLoop on both 1 and 2, allowing parallel processing without blocking. According to Apple's documentation, communication between the threads can be achieved by calling performSelector:onThread:withObject:waitUntilDone:. However, this symbol is not available in Swift.

I don't think that GCD would fit as a solution. Both worker threads should be long-running processes with new work coming in at random intervals.

How can one achieve the above (e.g. NSRunLoops on multiple threads) using Swift?

Daij-Djan
Bouke

1 Answer


I used NSOperation for the first time last month, and it's a really easy class to subclass. You can either chain operations together with completion blocks, or make operations dependencies of each other so that they're performed sequentially.
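As a rough sketch of the dependency approach (in Swift 3 and later, NSOperation is spelled Operation), the three stages from the question could be ordered like this. The stage names and the EventLog helper are made up for illustration; they only record the order in which the blocks run.

```swift
import Foundation

// Hypothetical helper: a lock-protected log so the blocks can record their order.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var entries: [String] = []

    func append(_ entry: String) {
        lock.lock(); defer { lock.unlock() }
        entries.append(entry)
    }

    var snapshot: [String] {
        lock.lock(); defer { lock.unlock() }
        return entries
    }
}

let log = EventLog()
let stageQueue = OperationQueue()

// Placeholder stages; real code would parse, convert and consume data here.
let parse = BlockOperation { log.append("parse") }
let convert = BlockOperation { log.append("convert") }
let consume = BlockOperation { log.append("consume") }

// An operation does not start until all of its dependencies have finished.
convert.addDependency(parse)
consume.addDependency(convert)

// The order in which the operations are added does not matter.
stageQueue.addOperations([consume, convert, parse], waitUntilFinished: true)

print(log.snapshot)  // ["parse", "convert", "consume"]
```

The dependencies, not the queue, are what enforce the order here, so the queue is free to run unrelated operations concurrently.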

It's also pretty easy to communicate with NSOperations by passing in objects to them.
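For example, an Operation subclass can take its input in its initializer and expose the result in a property. The sketch below assumes the work is the NSDictionary to [String: String] conversion mentioned in the question; ConvertAttributesOperation and the sample data are hypothetical, and each operation depends on the previous one so the input order is kept.

```swift
import Foundation

// Hypothetical operation: gets one element's attributes at init time and
// stores the converted dictionary once it has run.
final class ConvertAttributesOperation: Operation {
    private let input: NSDictionary
    private(set) var output: [String: String] = [:]

    init(input: NSDictionary) {
        self.input = input
        super.init()
    }

    override func main() {
        guard !isCancelled else { return }
        var result: [String: String] = [:]
        // Keep only the entries whose key and value are both strings.
        for (key, value) in input {
            if let k = key as? String, let v = value as? String {
                result[k] = v
            }
        }
        output = result
    }
}

// Made-up sample input standing in for attribute dictionaries from the parser.
let inputs = [
    NSDictionary(dictionary: ["id": "1"]),
    NSDictionary(dictionary: ["id": "2"]),
]

let conversionQueue = OperationQueue()
let operations = inputs.map { ConvertAttributesOperation(input: $0) }

// Chain each operation to the one before it so they run in input order.
var previous: Operation?
for operation in operations {
    if let previous = previous {
        operation.addDependency(previous)
    }
    previous = operation
}

conversionQueue.addOperations(operations, waitUntilFinished: true)

// Safe to read the outputs now that every operation has finished.
let results = operations.map { $0.output }
print(results)
```

For a 20 GB file you would not wait on the whole batch like this; you would probably add operations as the parser produces elements and hand each result on from the operation's completion block.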

NSHipster: http://nshipster.com/nsoperation/

MathewS