I found the following code compiles and works:

func foo(p:UnsafePointer<UInt8>) {
    var p = p
    for p; p.memory != 0; p++ {
        print(String(format:"%2X", p.memory))
    }
}

let str:String = "今日"
foo(str)

This prints E4BB8AE697A5, which is the valid UTF-8 representation of 今日.
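For readers on a newer toolchain, here is a rough sketch of the same experiment, assuming Swift 5 syntax (`pointee` instead of `memory`, no C-style `for` loop). The helper name `hexDump` is made up for illustration. It relies on the same implicit String-to-pointer conversion, which hands the function a temporary NUL-terminated UTF-8 buffer that is only valid for the duration of the call.

```swift
// Hypothetical helper: walks a NUL-terminated byte buffer and
// returns every byte as two uppercase hex digits.
func hexDump(_ start: UnsafePointer<UInt8>) -> String {
    var p = start
    var out = ""
    while p.pointee != 0 {
        let digits = String(p.pointee, radix: 16, uppercase: true)
        // Pad single-digit values so each byte always takes two characters.
        out += digits.count == 1 ? "0" + digits : digits
        p += 1
    }
    return out
}

let utf8Sample = "今日"
// The String argument is implicitly converted to a temporary UTF-8 C string.
let utf8Hex = hexDump(utf8Sample)
print(utf8Hex) // E4BB8AE697A5
```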

As far as I know, this is undocumented behavior. From the documentation:

When a function is declared as taking a UnsafePointer argument, it can accept any of the following:

  • nil, which is passed as a null pointer
  • An UnsafePointer, UnsafeMutablePointer, or AutoreleasingUnsafeMutablePointer value, which is converted to UnsafePointer if necessary
  • An in-out expression whose operand is an lvalue of type Type, which is passed as the address of the lvalue
  • A [Type] value, which is passed as a pointer to the start of the array, and lifetime-extended for the duration of the call
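The listed conversions can be sketched roughly as follows, assuming Swift 5, where a pointer parameter has to be declared optional before it accepts nil. The function `firstElement` is a hypothetical helper, not something from the documentation.

```swift
// Hypothetical helper: reads the first element through the pointer,
// or returns nil when it receives a null pointer.
func firstElement(_ p: UnsafePointer<Int32>?) -> Int32? {
    return p?.pointee
}

var value: Int32 = 7
let array: [Int32] = [1, 2, 3]

let fromNil = firstElement(nil)       // nil is passed as a null pointer
let fromInOut = firstElement(&value)  // in-out expression: address of the lvalue
let fromArray = firstElement(array)   // [Type] value: pointer to the start of the array
```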

In this case, str is none of them.

Am I missing something?


ADDED:

And it doesn't work if the parameter type is UnsafePointer<UInt16>:

func foo(p:UnsafePointer<UInt16>) {
    var p = p
    for p; p.memory != 0; p++ {
        print(String(format:"%4X", p.memory))
    }
}
let str:String = "今日"
foo(str)
//  ^ 'String' is not convertible to 'UnsafePointer<UInt16>'

Even though the internal String representation is UTF-16:

let str = "今日"
var p = UnsafePointer<UInt16>(str._core._baseAddress)
for p; p.memory != 0; p++ {
    print(String(format:"%4X", p.memory)) // prints 4ECA65E5 which is UTF16 今日
}
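The snippet above reaches into the private `_core` storage. A sketch that avoids private API, assuming Swift 5, is to go through the public `utf16` view and pass an explicitly NUL-terminated array, which does convert to UnsafePointer<UInt16>. The name `hexDump16` is invented for this example.

```swift
// Hypothetical helper: walks a NUL-terminated UTF-16 buffer and
// returns every code unit as four uppercase hex digits.
func hexDump16(_ start: UnsafePointer<UInt16>) -> String {
    var p = start
    var out = ""
    while p.pointee != 0 {
        let digits = String(p.pointee, radix: 16, uppercase: true)
        // Left-pad to four characters per code unit.
        out += String(repeating: "0", count: 4 - digits.count) + digits
        p += 1
    }
    return out
}

let utf16Sample = "今日"
// The array is not NUL-terminated on its own, so append a terminator.
let units = Array(utf16Sample.utf16) + [0]
let utf16Hex = hexDump16(units)
print(utf16Hex) // 4ECA65E5
```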
rintaro
  • It seems to be it is the last one, no? – Mundi Nov 21 '14 at 14:30
  • I think, no. `String` is not `Array` – rintaro Nov 21 '14 at 14:32
  • I meant to say the penultimate one. It is just like an in-out variable. Maybe the wording **"which is passed"** is not clear. It could mean "this is how the function will interpret this argument" (which I think is meant) or "this is what you have to pass in" (which I think is not meant here). – Mundi Nov 21 '14 at 14:35
  • it is now documented "A String value, if Type is Int8 or UInt8. The string will automatically be converted to UTF8 in a buffer, and a pointer to that buffer is passed to the function" https://developer.apple.com/library/content/documentation/Swift/Conceptual/BuildingCocoaApps/InteractingWithCAPIs.html#//apple_ref/doc/uid/TP40014216-CH8-XID_15 – Manish Nahar May 04 '17 at 10:20

1 Answer

This is working because of one of the interoperability changes the Swift team has made since the initial launch - you're right that it looks like it hasn't made it into the documentation yet. String works where an UnsafePointer<UInt8> is required so that you can call C functions that expect a const char * parameter without a lot of extra work.

Look at the C function strlen, defined in "shims.h":

size_t strlen(const char *s);

In Swift it comes through as this:

func strlen(s: UnsafePointer<Int8>) -> UInt

Which can be called with a String with no additional work:

let str = "Hi."
strlen(str)
// 3
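As a small sketch, assuming Swift 5 with Foundation available to expose `strlen`, the implicit conversion can be compared with the explicit `withCString` form, which makes the lifetime of the temporary C string visible. Note that `strlen` counts UTF-8 bytes, not characters.

```swift
import Foundation  // brings in the C library's strlen

let greeting = "Hi."

// Implicit: the String is converted to a temporary C string for the call.
let implicitLength = strlen(greeting)

// Explicit: the pointer is only valid inside the closure.
let explicitLength = greeting.withCString { strlen($0) }

// Non-ASCII text: two characters, but six UTF-8 bytes.
let byteLength = strlen("今日")

print(implicitLength, explicitLength, byteLength) // 3 3 6
```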

Look at the revisions on this answer to see how C-string interop has changed over time: https://stackoverflow.com/a/24438698/59541

Nate Cook
  • Thanks! Nice. According to the `swiftc -emit-sil` output, it actually creates a temporary `Array` from `String.UTF8View.Generator`. It looks *not* so fast... – rintaro Nov 21 '14 at 18:29
  • Huh. Well, the "I" in SIL stands for intermediate, right? Depending on how strings are actually implemented in the compiled runtime (what if they're just `char*` under the hood?), that might be a no-op. – Nate Cook Nov 21 '14 at 18:33
  • @rintaro: As of Swift 5 the underlying character storage is (null-terminated) UTF-8. One reason for the change was to make passing Swift strings to C functions efficient. – Martin R May 04 '20 at 10:04