Newbie to Objective-C and Swift here. I'm creating an NSArray from a C float array with the following code:

float* output_f = output.data_ptr<float>();
NSMutableArray *results = [NSMutableArray arrayWithCapacity: 1360*1060];
for (int i = 0; i < 1360 * 1016; i++) {
    [results insertObject:@(output_f[i]) atIndex:i];
}

However, since there are over a million samples to insert, this is slow and is becoming a bottleneck in my application. Is there a quicker way to create an NSArray from a C array without copying the elements one by one?

Willeke
Karnik Ram
  • 2
    That sounds like an XY problem. Why do you want to copy a content of a C array into an `NSArray`? – The Dreams Wind Sep 29 '22 at 21:25
  • I'm running a PyTorch model which is wrapped as an Objective C module. I want to use this model's output values in Swift. When I directly return its output float array to Swift as an UnsafeBufferPointer and copy it to an Array object, strangely all its values turn into zero. I was able to preserve its values only by copying it into an NSMutable array. However it's too slow – Karnik Ram Sep 29 '22 at 21:50
  • I learned about this way of passing the model's values (a C array) through an NSMutable array from this example [here](https://github.com/pytorch/ios-demo-app/blob/master/HelloWorld/HelloWorld/HelloWorld/TorchBridge/TorchModule.mm) – Karnik Ram Sep 29 '22 at 21:52
  • 2
    So the question in reality is "How do I pass a C-array to Swift?" – The Dreams Wind Sep 29 '22 at 22:21
  • Since `output_f` is a `float *` rather than `float [size]` just making it available in Swift, along with the number of floats in it, will let you do this: `UnsafeBufferPointer(start: output_f, count: num_floats)`. Since `UnsafeBufferPointer` conforms to `RandomAccessCollection` with `Int` indices, you can use it pretty much just like an `Array`. This doesn't make a copy. It makes a Swift-side pointer alias directly to the C array. – Chip Jarred Sep 29 '22 at 23:53
  • Actually on the Swift side, `output_f` will be `UnsafeMutablePointer?`. You can either create an `UnsafePointer` from it, `let ptr = UnsafePointer(myMutablePtr)` to initialize the `UnsafeBufferPointer`, or you could just create an `UnsafeMutableBufferPointer` instead, which would let you modify the data... whether that's a good thing to do or not depends on whether anything on the C-side continues to depend on it, once it's thrown over the wall to Swift. – Chip Jarred Sep 30 '22 at 00:00
  • And if you want it in a Swift `Array` you can, just construct it: `let floatArray = [CFloat](myBufferPtr)`. That will give you an actual copy. – Chip Jarred Sep 30 '22 at 00:02
  • Thank you for your comments! This is what I had tried originally (creating an UnsafeMutableBufferPointer from the returned UnsafeMutablePointer, and constructing an Array using it) but when this happens all the values in the array become zero. Interestingly, when I tried this out again now I noticed the BufferPointer gives an EXC_BAD_ACCESS error if I try to read from it directly instead of constructing an Array... Seems like the original memory is being deallocated somewhere but not sure why.. – Karnik Ram Sep 30 '22 at 00:17
  • The same thing happens even if I copy the returned `UnsafeMutablePointer?` into an `UnsafePointer` – Karnik Ram Sep 30 '22 at 00:18
  • 1
    I'd have to know more about the context, and exactly how `output` is defined. Its `data_ptr()` is undoubtedly just returning a pointer to some internal memory it holds rather than allocating a new block for you. So if `output` were disposed of prior to Swift trying to work with it, probably so is its internal memory to which `output_f` points. I'll update my answer to include this possibility, and how to deal with it. – Chip Jarred Sep 30 '22 at 00:55
  • Yeah, that is probably what is happening -- `output` is being disposed of quickly. This wasn't happening earlier when my data size was small < 100. Thanks! – Karnik Ram Sep 30 '22 at 01:03
  • Since it's actually C++, `output`'s destructor is probably being called. `data_ptr` I think is a member function of `vector`, so yeah, if `output` is going out of scope, then its memory is being deallocated. If it ever worked with small sizes, it was probably just by accident. – Chip Jarred Sep 30 '22 at 01:24
  • I updated my answer to cover how to handle this case, at least in the case where Swift calls the C to get the data, rather than C calling Swift. – Chip Jarred Sep 30 '22 at 01:24
  • Yeah, I am really curious why it was working before.. – Karnik Ram Sep 30 '22 at 01:51

2 Answers

2

There's no need to go through Obj-C. Assuming that output_f appears in an include file that's included via your bridging header, Swift will see its type as UnsafeMutablePointer<CFloat> (CFloat is just a typealias for Float, named to clarify that it corresponds to the C type).

Assuming you also make the number of floats in the array available, let's say that somewhere in your bridged header files you have:

extern float* output_f;
extern int output_f_count;

Then on the Swift-side, you can use them like this:

let outputFloats = UnsafeMutableBufferPointer<CFloat>(
    start: output_f, 
    count: Int(output_f_count))

The cast of output_f_count to Int is necessary because Swift interprets C's int as CInt (aka Int32).

You can use UnsafeMutableBufferPointer much like an array, but there's no copying. It just aliases the C data in Swift.

If you want to make sure you don't mutate the data, you can create an UnsafeBufferPointer instead, but you'll need to cast the pointer.

let outputFloats = UnsafeBufferPointer<CFloat>(
    start: UnsafePointer(output_f), 
    count: Int(output_f_count))

Since there's no copying, both of those options are very fast. However, they are pointers. If Swift modifies the contents, the C code will see the changed data, and vice-versa. That may or may not be a good thing, depending on your use case, but you definitely want to be aware of it.

If you want to make a copy, you can make a Swift Array very easily like this:

let outputFloatsArray = [CFloat](outputFloats)

Now you have your Swift-side copy in an Array.

As a very closely related thing, if in a C header, output_f were declared as an actual array like this,

extern float output_f[1360*1060];

Then Swift doesn't see a pointer. It sees, believe it or not, a tuple... a great big ugly tuple with a crap-load of CFloat members, which has the benefit of being a value type, but is hard to work with directly because you can't index into it. Fortunately you can work around that:

withUnsafeBytes(of: output_f) 
{
    let outputFloats = $0.bindMemory(to: CFloat.self)

    // Now within the scope of this closure you can use outputFloats
    // just as before.
}
  • Note: You can also use the pointer directly without going through the buffer pointer types, and because you avoid bounds-checking that way, it is a tiny bit faster, but just a very tiny bit, it's more awkward, and well... you lose the error catching benefits of bounds-checking. Plus the buffer pointer types provide all the RandomAccessCollection methods like map, filter, forEach, etc...

Update:

In comments OP said that he had tried this approach but got EXC_BAD_ACCESS while dereferencing the pointers. Missing is the context of what is happening between obtaining the pointer from output and its being available to Swift.

Given the clue from earlier that it's actually C++, I think output is probably std::vector<float>, and it's probably going out of scope before Swift does anything with the pointers, so its destructor is being called, which of course deletes its internal data pointer. In that case Swift is accessing memory that is no longer valid.

There are two ways to address this. The first is to make sure that output is not cleaned up until after Swift is done with its data. The other option is to copy the data in C.

const int capacity = 1360*1060;
float* p = output.data_ptr<float>();

// static_cast because the above template syntax indicates 
// this is actually C++, not C.
float* output_f = static_cast<float*>(calloc(capacity, sizeof(float)));
memcpy(output_f, p, capacity * sizeof(float));

Now output can be cleaned up before Swift accesses output_f. Also, this makes the copy that was originally asked about much faster than using NSArray. Assuming the C code doesn't use output_f after this, Swift can just take ownership of it. In that case, Swift needs to be sure to call free(output_f) when it's done.
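To make the ownership hand-off concrete, here is a minimal C++ sketch of the copy option. It keeps the assumption made above that output behaves like a std::vector<float>; the names copy_floats_to_heap and first_value_after_source_is_gone are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper: returns a malloc'd copy of `src`. The caller owns
// the returned buffer and must release it with free(). Returns nullptr if
// `src` is empty or the allocation fails.
float* copy_floats_to_heap(const std::vector<float>& src)
{
    if (src.empty()) { return nullptr; }

    const std::size_t byteCount = src.size() * sizeof(float);
    float* copy = static_cast<float*>(std::malloc(byteCount));
    if (copy == nullptr) { return nullptr; }

    std::memcpy(copy, src.data(), byteCount);
    return copy;
}

// Hypothetical demonstration that the copy outlives its source.
float first_value_after_source_is_gone()
{
    float* copy = nullptr;
    {
        std::vector<float> output = {1.5f, 2.5f, 3.5f};
        copy = copy_floats_to_heap(output);
    }   // `output` is destroyed here; `copy` still points at valid memory.

    assert(copy != nullptr);
    const float first = copy[0];
    std::free(copy);   // whoever owns the buffer frees it exactly once
    return first;
}
```

Whichever side ends up owning the buffer (here the C++ function, in the scenario above Swift) is the side that calls free, and it should do so exactly once.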

If the Swift code doesn't care about it being in an actual array, the Unsafe...BufferPointer types will do the job.
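For the first option mentioned earlier (keeping output alive until Swift is done with it), one possible shape is to park the data in a long-lived owner and expose C-callable accessors. This is only a sketch under the same std::vector<float> assumption; hold_output, held_floats, held_float_count and release_held_output are made-up names, not part of any real API.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <utility>
#include <vector>

namespace {
// Long-lived owner of the most recent output. It has static storage
// duration, so its buffer stays valid until it is replaced or released.
std::vector<float> g_heldOutput;
}

// Hypothetical C++-side call: take ownership of a freshly computed output.
void hold_output(std::vector<float> output)
{
    g_heldOutput = std::move(output);
}

// C-callable accessors that could be declared in a bridged header.
extern "C" const float* held_floats(void)
{
    return g_heldOutput.data();
}

extern "C" int held_float_count(void)
{
    // The count is handed to Swift as a C int, so it must fit in one.
    assert(g_heldOutput.size() <= static_cast<std::size_t>(INT_MAX));
    return static_cast<int>(g_heldOutput.size());
}

// Call this once the Swift side no longer reads from the pointer.
extern "C" void release_held_output(void)
{
    g_heldOutput.clear();
    g_heldOutput.shrink_to_fit();
}
```

The pointer returned by held_floats is only valid until the next hold_output or release_held_output call, so the Swift side would need to finish reading (or make its own copy) before either happens.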

However, if an actual Array is desired, this will be yet another copy, and copying the same data twice just to get it in a Swift Array doesn't make sense if it can be avoided. How to avoid it depends on whether C (or Obj-C) is calling Swift, or Swift is calling Obj-C. I'm going to assume that it's Swift calling C. So let's assume that Swift is calling some C function get_floats() defined like this:

extern "C" float* get_floats()
{
    const int capacity = 1360*1060;
    float* p = output.data_ptr<float>();

    // static_cast because the above template syntax indicates 
    // this is actually C++, not C.
    float* output_f = static_cast<float*>(
        calloc(capacity, sizeof(float))
    );
    memcpy(output_f, p, capacity * sizeof(float));

    // Maybe do other work including disposing of `output`

    return output_f;
}

You want to change the interface so that a pre-allocated pointer is provided as a parameter, along with its capacity.

extern "C" void get_floats(float *output_f, int capacity)
{
    float* p = output.data_ptr<float>();

    memcpy(output_f, p, capacity * sizeof(float));

    // Maybe do other work including disposing of `output`

    // can use return for something else now -- maybe error code?
}

On the Swift side, you could allocate pointers, but since you want it in an Array anyway:

var outputFloats = [CFloat](repeating: 0, count: 1360*1060)

outputFloats.withUnsafeMutableBufferPointer {
    get_floats($0.baseAddress, CInt($0.count))
}

// Now the array is populated with the contents of the C array.

One last thing. The above code makes an assumption that output.data_ptr() points to at least capacity number of floats. Are you sure this is true? Assuming output is std::vector, it would be better to change the memcpy call to:

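    // std::min needs both arguments to have the same type.
    const size_t floatsToCopy =
        std::min(static_cast<size_t>(capacity), output.size());
    memcpy(output_f, p, floatsToCopy * sizeof(float));

That ensures that you're not reading garbage from the end of real data if it's actually less than capacity. Then you can change the return type of get_floats from void to int and finish it with return static_cast<int>(floatsToCopy);.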
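Putting the bounded copy and the returned count together, a self-contained sketch might look like the following. It still assumes output is a std::vector<float>, and it passes the source in as a parameter so the example stands alone; copy_floats is a hypothetical stand-in for get_floats, not the real function.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for get_floats: copies at most `capacity` floats
// from `output` into the caller-provided buffer `output_f` and returns how
// many floats were actually copied.
int copy_floats(const std::vector<float>& output, float* output_f, int capacity)
{
    if (output_f == nullptr || capacity <= 0) { return 0; }

    // Never copy more than the destination holds or the source contains.
    const std::size_t floatsToCopy =
        std::min(static_cast<std::size_t>(capacity), output.size());

    if (floatsToCopy > 0) {
        std::memcpy(output_f, output.data(), floatsToCopy * sizeof(float));
    }
    return static_cast<int>(floatsToCopy);
}
```

Elements of the destination beyond the returned count are left untouched, which is why the caller trims or ignores them afterwards.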

Then on the Swift side, it looks like this:

var outputFloats = [CFloat](repeating: 0, count: 1360*1060)

let floatsCopied = outputFloats.withUnsafeMutableBufferPointer {
    get_floats($0.baseAddress, CInt($0.count))
}

outputFloats.removeLast(outputFloats.count - Int(floatsCopied))

Note that Array's removeLast(_:) doesn't take a keepingCapacity parameter (that one belongs to removeAll(keepingCapacity:)), and it isn't needed here: trimming elements off the end normally leaves the array's storage allocated, so you can re-use the array without having to pay for more memory allocations. Just refill it out to its full size before calling get_floats again with the same array. Re-using the storage matters because the alternative is an allocation, a copy, and a free, and the whole point was to avoid a copy... but the dynamic memory allocation is the really slow part. Given CPU caches and the way instruction pipelines work, you can do a lot of sequential copying in the time it takes to do a single memory allocation.

Chip Jarred
  • Thank you so much for your patience, you helped me figure out what the real problem is and finally solve it! Onto other problems now.. – Karnik Ram Sep 30 '22 at 01:54
  • I'm glad my answer and comments were of some help! – Chip Jarred Sep 30 '22 at 02:05
  • @KarnikRam, I just fixed two rather important problems with the code examples in my answer that might affect you. The first in that you need to multiply the number of items being copied by `sizeof(float)` when calling `memcpy`. The other is you should probably use `std::min(capacity, output.count())` as the number of floats to copy, so you don't read past the end of valid data. – Chip Jarred Sep 30 '22 at 02:38
  • I meant `output.size()` not `output.count()`... Switching between C++ and Swift has side-effects in the brain! When I write Metal shaders, I constantly forget semicolons, despite decades of writing C/C++ code before Apple released Swift. I never even gave a thought to semicolons, because my fingers just typed them on autopilot. Not anymore. – Chip Jarred Sep 30 '22 at 02:57
  • 1
    Thank you! The missing multiplication was causing a problem (zero values past an index) that I didn't notice until just now. I'm not sure if I need to do the check for reading past the memory right now, but will keep an eye.. – Karnik Ram Sep 30 '22 at 14:53
1

According to the comments section your final goal is to read C-array data in Swift. Provided you know the length of the array, you can return it from an Objective-C function as a pointer:

- (float *)cArray {
    float *arr = (float *)malloc(sizeof(float) * 4);
    for (int i = 0; i < 4; ++i) {
        arr[i] = i;
    }
    return arr;
}

And just read it from an UnsafePointer in Swift:

let ptr = TDWObject().cArray()

(0 ..< 4).forEach {
    print(ptr.advanced(by: $0).pointee)
}

Don't forget to release the pointer when you are done with it. The memory came from malloc, so free is the matching call:

free(ptr)
The Dreams Wind
  • Thank you so much for your help. I tried this out. I returned the raw float * arr from an Obj-C function into Swift where it is loaded as an UnsafeMutablePointer. But when I try to access it using the advanced method you showed or by directly indexing it, I get a EXC_BAD_ACCESS error. I understand this is a memory access error but I am not sure what's causing it in this case – Karnik Ram Sep 29 '22 at 23:47