12

I have a long string (sometimes over 1000 characters) that I want to convert to an array of boolean values. And it needs to do this many times, very quickly.

let input: String = "001"
let output: [Bool] = [false, false, true]

My naive attempt was this:

input.characters.map { $0 == "1" }

But this is a lot slower than I'd like. My profiling has shown me that the map is where the slowdown is, but I'm not sure how much simpler I can make that.

I feel like this would be wicked fast without Swift's/ObjC's overhead. In C, I think this is a simple for loop where a byte of memory is compared to a constant, but I'm not sure what the functions or syntax is that I should be looking at.

Is there a way to do this much faster?

UPDATE:

I also tried a

output = []
for char in input.characters {
    output.append(char == "1")
}

And it's about 15% faster. I'm hoping for a lot more than that.

Alex Wayne
  • 178,991
  • 47
  • 309
  • 337
  • check with raw for..in – dimpiax Mar 18 '16 at 00:51
  • @dimpiax How so exactly? I edited the question with an attempt at a manual `for` loop, and it does help a little. – Alex Wayne Mar 18 '16 at 01:05
  • 2
    Sample size of "001" is a bit small for actual measurable differences. Can you provide a larger sample set? You also couldn't possible have measured any difference in the time it takes to loop over 3 characters. (Debugger attached? invalid results!) – Claus Jørgensen Mar 18 '16 at 01:08
  • Also, if you do a (bridge free) cast to NSString, you can use `.UTF8String()` to get a array of `const char *` which effectively is the same as a array of booleans, if you assume it's always 0 or 1 – Claus Jørgensen Mar 18 '16 at 01:10
  • My actual data is bunch of these from 5 to 1200 characters long. – Alex Wayne Mar 18 '16 at 01:10
  • some time ago I read about different ways of compiling swift, which sometimes makes an enormous performance difference. Will try to find it – Fr4nc3sc0NL Mar 18 '16 at 01:12
  • Have you tried `output.reserveCapacity(input.chracters.count)`? – JCorcuera Mar 18 '16 at 01:16
  • Probably not a good idea, but you could try to compile with -Ofast: http://stackoverflow.com/questions/24101718/swift-performance-sorting-arrays (removes alot of overhead) – Fr4nc3sc0NL Mar 18 '16 at 01:16
  • At the end of the day, Swift can directly interop with C, so you could always just write a C function to do it if performance we that important. – nhgrif Mar 18 '16 at 01:17
  • @AlexWayne: just a question: why are you trying this kind of conversion? –  Mar 18 '16 at 07:31
  • @IanBell I'm consuming a JSON API (which we control), and it delivers a lot of objects that contains info about thousands of points that can be completed, or not completed. We first tried to do encode that in JSON `[true, false]` and found it to be slow on the client and the server. Joining the values into a string doubled the speed of both the server rendering, and the client parsing. But client side parsing was still too slow, hence this question. – Alex Wayne Mar 18 '16 at 18:05

8 Answers8

13

This is faster:

// Algorithm 'A'
let input = "0101010110010101010"
var output = Array<Bool>(count: input.characters.count, repeatedValue: false)
for (index, char) in input.characters.enumerate() where char == "1" {
    output[index] = true
}

Update: under input = "010101011010101001000100000011010101010101010101"

0.0741 / 0.0087, where this approach is faster that author's in 8.46 times. With bigger data correlation more positive.

Also, with using nulTerminatedUTF8 speed a little increased, but not always speed higher than algorithm A:

// Algorithm 'B'
let input = "10101010101011111110101000010100101001010101"
var output = Array<Bool>(count: input.nulTerminatedUTF8.count, repeatedValue: false)
for (index, code) in input.nulTerminatedUTF8.enumerate() where code == 49 {
    output[index] = true
}

In result graph appears, with input length 2196, where first and last 0..1, A – second, B – third point. A: 0.311sec, B: 0.304sec

Algorithm comparison graph

dimpiax
  • 12,093
  • 5
  • 62
  • 45
  • _Why_ is this faster? Is it because we create the whole final array _first_ and then _modify_ it rather than enlarging it as we iterate the original array? If so, it seems to me that that's a flaw in the underlying implementation of `map` and should be reported as a bug. – matt Mar 20 '16 at 01:36
  • @matt Nope, the main idea that you write value not for each index, when `map` does. – dimpiax Mar 20 '16 at 01:38
  • What optimization settings was the test compiled with? – Catfish_Man Mar 20 '16 at 02:20
  • @Catfish_Man test in playground – dimpiax Mar 20 '16 at 10:06
  • 1
    Ah, I would strongly suggest retesting outside a playground. They have a very large impact on performance, and can throw measurements off a lot. – Catfish_Man Mar 20 '16 at 18:25
5
import Foundation

let input:String = "010101011001010101001010101100101010100101010110010101010101011001010101001010101100101010100101010101011001010101001010101100101010100101010"
var start  = clock()
var output = Array<Bool>(count: input.nulTerminatedUTF8.count, repeatedValue: false)
var index = 0
for val in input.nulTerminatedUTF8 {
    if val != 49 {
        output[index] = true
    }
    index+=1
}
var diff = clock() - start;
var msec = diff * 1000 / UInt(CLOCKS_PER_SEC);
print("Time taken \(Double(msec)/1000.0) seconds \(msec%1000) milliseconds");

This should be really fast. Try it out. For 010101011010101001000100000011010101010101010101 it takes 0.039 secs.

Pradeep K
  • 3,671
  • 1
  • 11
  • 15
  • <3 you. This approach took my parsing from 2.2 seconds to 0.22 seconds. I was looking for an order of magnitude improvement, and you delivered! – Alex Wayne Mar 18 '16 at 18:00
  • I saw a sightly better performance on large data set if you change `repeatedValue:true` and then `if val == 49 { output[index] = false }`. May be == is faster than !=. You can try it with your dataset and see if you get even better performance. – Pradeep K Mar 18 '16 at 18:23
  • @AlexWayne, place your data on gist. I will test performance. Tested Pradeep's vs mine solutions – result: 2.546 vs 0.826. – dimpiax Mar 18 '16 at 19:58
  • I agree. @dimpiax's is 3 times faster. – Pradeep K Mar 19 '16 at 03:21
  • @dimpiax I did some benchmarking a playground, and pradeeps solutions seems about _40x faster_ when operating on 1024 characters. My test playground: https://gist.github.com/Squeegy/328a4ca72e0db22793ed – Alex Wayne Mar 20 '16 at 00:13
  • @dimpiax I think I got you and another poster confused. But you're both right. `nulTerminatedUTF8` and comparing the byte value was the key here. – Alex Wayne Mar 20 '16 at 00:22
  • @AlexWayne, there is no mine decision. – dimpiax Mar 20 '16 at 00:23
  • @AlexWayne, really `nulTerminatedUTF8` gives depends performance. If you test hard, you can see that results vice versa. – dimpiax Mar 20 '16 at 00:35
1

This should be a little faster than the enumerate() where char == "1" version (0.557s for 500_000 alternating ones and zeros vs. 1.159s algorithm 'A' from diampiax)

let input = inputStr.utf8
let n = input.count
var output = [Bool](count: n, repeatedValue: false)
let one = UInt8(49) // 1
for (idx, char) in input.enumerate() {
    if char == one { output[idx] = true }
}

but it's also a lot less readable ;-p

edit: both versions are slower than the map variant, maybe you forgot to compile with optimizations?

nemesit
  • 449
  • 3
  • 7
1

I would guess that this is as fast as possible:

let targ = Character("1")
let input: String = "001" // your real string goes here
let inputchars = Array(input.characters)
var output:[Bool] = Array.init(count: inputchars.count, repeatedValue: false)
inputchars.withUnsafeBufferPointer {
    inputbuf in
    output.withUnsafeMutableBufferPointer {
        outputbuf in
        var ptr1 = inputbuf.baseAddress
        var ptr2 = outputbuf.baseAddress
        for _ in 0..<inputbuf.count {
            ptr2.memory = ptr1.memory == targ
            ptr1 = ptr1.successor()
            ptr2 = ptr2.successor()
        }
    }
}
// output now contains the result

The reason is that, thanks to the use of buffer pointers, we are simply cycling through contiguous memory, just like the way you cycle through a C array by incrementing its pointer. Thus, once we get past the initial setup, this should be as fast as it would be in C.

EDIT In an actual test, the time difference between the OP's original method and this one is the difference between

13.3660290241241

and

0.219357967376709

which is a pretty dramatic speed-up. I hasten to add, however, that I have excluded the initial set-up from the timing test. This line:

let inputchars = Array(input.characters)

...is particularly expensive.

matt
  • 515,959
  • 87
  • 875
  • 1,141
0

One more step should speed that up even more. Using reserveCapacity will resize the array once before the loops starts instead of trying to do it as the loop runs.

var output = [Bool]()
output.reserveCapacity(input.characters.count)
for char in input.characters {
    output.append(char == "1")
}
Dion Larson
  • 868
  • 8
  • 14
  • Strangely, this has the opposite effect. Adding the `reserve capacity line makes it about 25% slower. :( – Alex Wayne Mar 18 '16 at 17:51
  • That's interesting. I wonder if it leads to the same slowdown with -Ofast enabled. It would be nice to know what size an array needs to be before `reserveCapacity` actually leads to a speed increase. – Dion Larson Mar 18 '16 at 17:57
0

Use withCString(_:) to retrieve a raw UnsafePointer<Int8>. Iterate over that and compare to 49 (ascii value of "1").

Léo Natan
  • 56,823
  • 9
  • 150
  • 195
0

What about a more functional style? It's not fastest (47 ms), today, for sure...

import Cocoa

let start = clock()

let bools = [Bool](([Character] ("010101011001010101001010101100101010100101010110010101010101011001010101001010101100101010100101010101011001010101001010101100101010100101010".characters)).map({$0 == "1"}))

let msec = (clock() - start) * 1000 / UInt(CLOCKS_PER_SEC);
print("Time taken \(Double(msec)/1000.0) seconds \(msec%1000) milliseconds");
Jaypee
  • 1
0

I need to some testing to be sure but I think one issue with many approaches given including the original map is that they need to iterate over the string to count the characters and then a second time to actually process the characters.

Have you tried:

let output = [Bool](input.characters.lazy.map { $0 == "1" })

This might only do a single iteration.

The other thing that could speed things up is if you can avoid using strings but instead use arrays of characters of an appropriate encoding (particularly if is more fixed size units (e.g. UTF16 or ASCII). Then then length lookup will be O(1) rather than O(n) and the iteration may be faster too

BTW always test performance with the optimiser enabled and never in the Playground because the performance characteristics are completely different, sometimes by a factor of 100.

Joseph Lord
  • 6,446
  • 1
  • 28
  • 32