After some extensive investigation, I believe I've discovered the most efficient way of getting a []byte from a string as of Go 1.17 (this is for i386/x86_64 gc; I haven't tested other architectures). The trade-off is that the most efficient code is also the least convenient to write.
Before I say anything else, it should be made clear that the differences are ultimately very small and probably inconsequential -- the info below is for fun/educational purposes only.
Summary
With some minor alterations, the accepted answer illustrating the technique of slicing a pointer to array is the most efficient way. That being said, I wouldn't be surprised if unsafe.Slice becomes the (decisively) better choice in the future.
unsafe.Slice
unsafe.Slice currently has the advantage of being slightly more readable, but I'm skeptical about its performance. It looks like it makes a call to runtime.unsafeslice. The following is the gc amd64 1.17 assembly of the function provided in Atamiri's answer (FUNCDATA omitted). Note the stack check (lack of NOSPLIT):
unsafeGetBytes_pc0:
TEXT "".unsafeGetBytes(SB), ABIInternal, $48-16
CMPQ SP, 16(R14)
PCDATA $0, $-2
JLS unsafeGetBytes_pc86
PCDATA $0, $-1
SUBQ $48, SP
MOVQ BP, 40(SP)
LEAQ 40(SP), BP
PCDATA $0, $-2
MOVQ BX, ""..autotmp_4+24(SP)
MOVQ AX, "".s+56(SP)
MOVQ BX, "".s+64(SP)
MOVQ "".s+56(SP), DX
PCDATA $0, $-1
MOVQ DX, ""..autotmp_5+32(SP)
LEAQ type.uint8(SB), AX
MOVQ BX, CX
MOVQ DX, BX
PCDATA $1, $1
CALL runtime.unsafeslice(SB)
MOVQ ""..autotmp_5+32(SP), AX
MOVQ ""..autotmp_4+24(SP), BX
MOVQ BX, CX
MOVQ 40(SP), BP
ADDQ $48, SP
RET
unsafeGetBytes_pc86:
NOP
PCDATA $1, $-1
PCDATA $0, $-2
MOVQ AX, 8(SP)
MOVQ BX, 16(SP)
CALL runtime.morestack_noctxt(SB)
MOVQ 8(SP), AX
MOVQ 16(SP), BX
PCDATA $0, $-1
JMP unsafeGetBytes_pc0
Other unimportant fun facts about the above (easily subject to change): compiled size of 3326 B; has an inline cost of 7; correct escape analysis: s leaks to ~r1 with derefs=0.
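For reference, the function being disassembled is presumably along these lines. This is a reconstruction rather than a quote from that answer, and the name bytesViaUnsafeSlice is made up for illustration. The comments list the standard gc flags that print the kind of output quoted in this post.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// bytesViaUnsafeSlice is a hypothetical reconstruction of an
// unsafe.Slice-based conversion (requires Go 1.17 or newer).
// The result aliases the string's memory and must never be written to.
//
// To inspect what the compiler does with it:
//
//	go build -gcflags=-S .   // assembly listing
//	go build -gcflags=-m=2 . // inline cost and escape analysis
func bytesViaUnsafeSlice(s string) []byte {
	// Reinterpret the string header's Data field as a pointer to the first byte.
	p := unsafe.Pointer((*reflect.StringHeader)(unsafe.Pointer(&s)).Data)
	// unsafe.Slice builds a slice with both len and cap equal to len(s).
	return unsafe.Slice((*byte)(p), len(s))
}

func main() {
	b := bytesViaUnsafeSlice("hello")
	fmt.Println(len(b), cap(b), string(b))
}
```

The slice is only safe to read; the bytes of a string literal typically live in read-only memory.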
Carefully Modifying *reflect.SliceHeader
This method has the advantage/disadvantage of letting one modify the internal state of a slice directly. Unfortunately, due to its multiline nature and use of uintptr, the GC can easily mess things up if one is not careful about keeping a reference to the original string. Here I avoided creating temporary pointers, both to reduce inline cost and to avoid needing to add runtime.KeepAlive:
func unsafeGetBytes(s string) (b []byte) {
	(*reflect.SliceHeader)(unsafe.Pointer(&b)).Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
	(*reflect.SliceHeader)(unsafe.Pointer(&b)).Cap = len(s)
	(*reflect.SliceHeader)(unsafe.Pointer(&b)).Len = len(s)
	return
}
The corresponding assembly on amd64 (FUNCDATA omitted):
TEXT "".unsafeGetBytes(SB), NOSPLIT|ABIInternal, $32-16
SUBQ $32, SP
MOVQ BP, 24(SP)
LEAQ 24(SP), BP
MOVQ AX, "".s+40(SP)
MOVQ BX, "".s+48(SP)
MOVQ $0, "".b(SP)
MOVUPS X15, "".b+8(SP)
MOVQ "".s+40(SP), DX
MOVQ DX, "".b(SP)
MOVQ "".s+48(SP), CX
MOVQ CX, "".b+16(SP)
MOVQ "".s+48(SP), BX
MOVQ BX, "".b+8(SP)
MOVQ "".b(SP), AX
MOVQ 24(SP), BP
ADDQ $32, SP
RET
Other unimportant fun facts about the above (easily subject to change): compiled size of 3700 B; has an inline cost of 20; subpar escape analysis: s leaks to {heap} with derefs=0.
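For comparison, the more conventional shape of this technique uses temporary header pointers. A rough sketch follows, under the assumption that a runtime.KeepAlive on s is the safeguard the note above has in mind. The function name is hypothetical, and I have not measured its inline cost.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
	"unsafe"
)

// unsafeGetBytesKeepAlive is an illustrative variant that uses temporary
// header pointers. The headers are only ever accessed through pointers to
// a real string and a real slice, never as standalone header values.
func unsafeGetBytesKeepAlive(s string) (b []byte) {
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data = sh.Data
	bh.Cap = sh.Len
	bh.Len = sh.Len
	// Assumption: keep s reachable until b's header has been filled in.
	runtime.KeepAlive(s)
	return
}

func main() {
	b := unsafeGetBytesKeepAlive("gopher")
	fmt.Println(len(b), cap(b), string(b))
}
```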
Unsafer version of modifying SliceHeader
Adapted from Nuno Cruces' answer. This relies on the inherent structural similarity between StringHeader and SliceHeader, so in a sense it breaks "more easily". Additionally, it temporarily creates an illegal state where cap(b) (being 0) is less than len(b).
func unsafeGetBytes(s string) (b []byte) {
	*(*string)(unsafe.Pointer(&b)) = s
	(*reflect.SliceHeader)(unsafe.Pointer(&b)).Cap = len(s)
	return
}
Corresponding assembly (FUNCDATA omitted):
TEXT "".unsafeGetBytes(SB), NOSPLIT|ABIInternal, $32-16
SUBQ $32, SP
MOVQ BP, 24(SP)
LEAQ 24(SP), BP
MOVQ AX, "".s+40(FP)
MOVQ $0, "".b(SP)
MOVUPS X15, "".b+8(SP)
MOVQ AX, "".b(SP)
MOVQ BX, "".b+8(SP)
MOVQ BX, "".b+16(SP)
MOVQ "".b(SP), AX
MOVQ BX, CX
MOVQ 24(SP), BP
ADDQ $32, SP
NOP
RET
Other unimportant details: compiled size 3636 B, inline cost of 11, with subpar escape analysis: s leaks to {heap} with derefs=0.
Slicing a pointer to array
This is the accepted answer (shown here for comparison) -- its primary disadvantage is its ugliness (viz. magic number 0x7fff0000). There's also the tiniest possibility of getting a string bigger than the array (in which case the slice expression panics), and an unavoidable bounds check.
func unsafeGetBytes(s string) []byte {
	return (*[0x7fff0000]byte)(unsafe.Pointer(
		(*reflect.StringHeader)(unsafe.Pointer(&s)).Data),
	)[:len(s):len(s)]
}
Corresponding assembly (FUNCDATA removed):
TEXT "".unsafeGetBytes(SB), NOSPLIT|ABIInternal, $24-16
SUBQ $24, SP
MOVQ BP, 16(SP)
LEAQ 16(SP), BP
PCDATA $0, $-2
MOVQ AX, "".s+32(SP)
MOVQ BX, "".s+40(SP)
MOVQ "".s+32(SP), AX
PCDATA $0, $-1
TESTB AL, (AX)
NOP
CMPQ BX, $2147418112
JHI unsafeGetBytes_pc54
MOVQ BX, CX
MOVQ 16(SP), BP
ADDQ $24, SP
RET
unsafeGetBytes_pc54:
MOVQ BX, DX
MOVL $2147418112, BX
PCDATA $1, $1
NOP
CALL runtime.panicSlice3Alen(SB)
XCHGL AX, AX
Other unimportant details: compiled size 3142 B, inline cost of 9, with correct escape analysis: s leaks to ~r1 with derefs=0.
Note the runtime.panicSlice3Alen -- this is the bounds check that verifies that len(s) is within 0x7fff0000.
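A detail worth spelling out is the three-index slice [:len(s):len(s)], which pins the capacity to the string's length. The sketch below reuses the accepted answer's function as shown above and illustrates the consequence: append has no spare capacity, so it has to copy into fresh memory instead of writing past the end of the string data. The main function and its sample strings are only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// unsafeGetBytes is the accepted answer's version, unchanged.
func unsafeGetBytes(s string) []byte {
	return (*[0x7fff0000]byte)(unsafe.Pointer(
		(*reflect.StringHeader)(unsafe.Pointer(&s)).Data),
	)[:len(s):len(s)]
}

func main() {
	s := "hello world"
	b := unsafeGetBytes(s[:5]) // aliases the first five bytes of s
	fmt.Println(len(b), cap(b))

	// cap(b) == len(b), so append allocates a new backing array
	// and leaves the memory shared with s untouched.
	c := append(b, '!')
	fmt.Println(string(c), "/", s)
}
```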
Improved slicing pointer to array
This is what I've concluded to be the most efficient method as of Go 1.17. I basically modified the accepted answer to eliminate the bounds check, and found a "more meaningful" constant (math.MaxInt32) to use than 0x7fff0000. Masking len(s) with MaxInt32 lets the compiler prove that the slice indices can never exceed the array length, so it drops the check. Using MaxInt32 preserves 32-bit compatibility.
func unsafeGetBytes(s string) []byte {
	const MaxInt32 = 1<<31 - 1
	return (*[MaxInt32]byte)(unsafe.Pointer((*reflect.StringHeader)(
		unsafe.Pointer(&s)).Data))[:len(s)&MaxInt32:len(s)&MaxInt32]
}
Corresponding assembly (FUNCDATA removed):
TEXT "".unsafeGetBytes(SB), NOSPLIT|ABIInternal, $0-16
PCDATA $0, $-2
MOVQ AX, "".s+8(SP)
MOVQ BX, "".s+16(SP)
MOVQ "".s+8(SP), AX
PCDATA $0, $-1
TESTB AL, (AX)
ANDQ $2147483647, BX
MOVQ BX, CX
RET
Other unimportant details: compiled size 3188 B, inline cost of 13, and correct escape analysis: s leaks to ~r1 with derefs=0.
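One caveat with both array-slicing variants: the generated code dereferences the data pointer (the TESTB AL, (AX) nil check in the listings), so a string whose data pointer is nil, such as the zero-value string, can panic. Below is a minimal sketch of a guarded wrapper, assuming an empty string should map to a nil slice. The name unsafeGetBytesGuarded is made up here, and I have not checked how the extra branch affects inline cost.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// unsafeGetBytesGuarded is an illustrative wrapper around the masked
// array-slicing technique. It returns nil for empty input so that a nil
// data pointer is never dereferenced.
func unsafeGetBytesGuarded(s string) []byte {
	if s == "" {
		return nil // assumption: callers accept nil for an empty string
	}
	const MaxInt32 = 1<<31 - 1
	// The mask is a no-op for any length below 2^31. It exists only so the
	// compiler can prove the indices fit within the array type.
	return (*[MaxInt32]byte)(unsafe.Pointer((*reflect.StringHeader)(
		unsafe.Pointer(&s)).Data))[:len(s)&MaxInt32:len(s)&MaxInt32]
}

func main() {
	var empty string
	fmt.Println(unsafeGetBytesGuarded(empty) == nil)

	b := unsafeGetBytesGuarded("gopher")
	fmt.Println(len(b), cap(b), string(b))
}
```

As with every variant here, the returned bytes alias the string and should be treated as read-only.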