2

I have an array with many millions of elements (7201 x 7201 data points) where I am converting the data to a greyscale image.

var imagePixels = heights.map { terrainToColorScale($0.height, gradient: .Linear) }
let _ = writeImageToFile(imagePixels, height: latStrides, width: lonStrides, imgColorSpace: .BW, to: imageURL, as: .png)

This snippet of code takes about 11 seconds to complete (CPU=2.3Ghz 8-Core i9), but I would like to get better performance if possible. Code is currently running in a single thread.

Would simply splitting my heights array into chunks (say 100 chunks) and running a TaskGroup with a task for each chunk get a decent improvement? Or am I looking at getting into Metal and shaders (I know zero about Metal!!) to get a better result?

Just for interest, the typical image generated is... (Image downsampled as would be too big for upload here.) Greyscale image of SRTM 1 arc-sec data


Update: Adding code associated with terrainToColorScale Basically, for a linear conversion it will take the terrain height (typically 0...9000) and scale it to return a value between 0...255 I also have non-linear implementations (not shown below) that will show greater detail for datasets that are mostly low/high terrain elevations.

let terrainLowerCutOff: Double = 0.0     // Mean Sea Level
let terrainUpperCutOff: Double = 9000.0  // Value in meters, just higher that Everest

func terrainToColorScale(_ elev: Double, lowerCutOff: Double = terrainLowerCutOff, upperCutOff: Double = terrainUpperCutOff, gradient: ImageColorGradient = .Linear) -> UInt8 {
  switch gradient {
  case .Linear:
    return linearColorScale(elev, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff)
  case .LinearInverse:
    return linearInverseColorScale(elev, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff)
  case .CurveEmphasiseLows:
    return reciprocalPowerColorScale(elev, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff)
  case .CurveEmphasiseLowsInverse:
    return reciprocalInversePowerColorScale(elev, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff)
  case .CurveEmphasiseHighs:
    return powerColorScale(elev, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff)
  case .CurveEmphasiseHighsInverse:
    return powerInverseColorScale(elev, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff)
  }
}

fileprivate func linearColorScale(_ value: Double, lowerCutOff: Double, upperCutOff: Double) -> UInt8 {
    return UInt8( 255 * normaliseColorScale(value, lowerCutOff: lowerCutOff, upperCutOff: upperCutOff) )
}

fileprivate func normaliseColorScale(_ value: Double, lowerCutOff: Double, upperCutOff: Double) -> Double {
  switch value {
  case _ where value <= lowerCutOff :
    return 0.0
  case _ where value >= upperCutOff :
    return 1.0
  default :
    return (value - lowerCutOff) / (upperCutOff - lowerCutOff)
  }
}
Jeshua Lacock
  • 5,730
  • 1
  • 28
  • 58
KieranC
  • 57
  • 6
  • You should look into using `CoreImage` filters for this. – EmilioPelaez Jan 26 '22 at 13:28
  • 1
    You need look into the accelerate framework: https://developer.apple.com/documentation/accelerate – Cy-4AH Jan 26 '22 at 14:17
  • It would be helpful to see the implementation of terrainToColorScale. I assume you've used Instruments to highlight the performance hotspots? And second vote here for Accelerate, but it depends what you're doing in that function. – jrturton Jan 26 '22 at 14:39
  • 1
    Thanks, I'll take a look. But it is not the manipulation or handling of the image and saving to file that is taking the time here. For these two lines of code, 99.9% of the time is running ``.map``` on the array. The ```heights``` is an array is of a custom data type, using ```.map``` to convert one property of the datatype to an array of ```UInt8``` which then is my bitmap data for the image. – KieranC Jan 26 '22 at 14:53
  • @jrturton Code snippet added to post. :) – KieranC Jan 26 '22 at 15:13

1 Answers1

1

This is not a complete answer to your question, but I think it should give you a start on where to go. vDSP is part of Accelerate, and it's built to speed up mathematical operations on arrays. This code uses multiple steps, so probably could be more optimised, and it's not taking any other filters than linear into account, but I don't have enough knowledge to make the steps more effective. However, on my machine, vDSP is 4x faster than map for the following processing:

import Foundation
import Accelerate

let count = 7200 * 7200
var date = Date()
print("Generating")
let test: [CGPoint] = (0..<count).map {
    CGPoint(x: $0, y: Int.random(in: -2000...10000))
}
print("Generating took \(Date().timeIntervalSince(date))")
date = Date()
print("Mapping")
let heights: [Float] = test.map { Float($0.y) }
print("Mapping took \(Date().timeIntervalSince(date))")
date = Date()
print("Converting via vDSP")
let clipped = vDSP.clip(heights, to: 0...9000)
let scaled = vDSP.divide(clipped, 9000)
let multiplied = vDSP.multiply(255, scaled)
let integers = vDSP.floatingPointToInteger(multiplied, integerType: UInt8.self, rounding: .towardNearestInteger)
print("Converting via DSP took \(Date().timeIntervalSince(date))")
date = Date()

print("Converting via map")
let mappedIntegers = heights.map { height -> UInt8 in
    let clipped: Float
    if height < 0 {
        clipped = 0
    } else if height > 9000 {
        clipped = 9000
    } else {
        clipped = height
    }
    let scaled = clipped / 9000
    let multiplied = scaled * 255
    return UInt8(multiplied.rounded())
}
print("Converting via map took \(Date().timeIntervalSince(date))")
jrturton
  • 118,105
  • 32
  • 252
  • 268
  • 1
    Thank you. I can certainly see how to make that work for my code. FYI, the non-linear filters are simply taking the scaled value and raising to a power (e.g. ```pow(scaled, 2)```, then multiply by 255 as before. Not sure what the syntax in vDSP is but that's something for me to find out. ;) – KieranC Jan 26 '22 at 18:05
  • 1
    vForce is the library for transcendental operations such as raising to the power. `vvpowsf` raises each element in an array to a scalar exponent. May I suggest that if you implement @jrturton's code above that you use the function variants with the `result` parameter. That way your code won't allocate new space for each operation. These operations work in-place: you can use the source as the destination. – Flex Monkey May 02 '22 at 16:32