I find it hard to believe I'm the first person to run into this problem but searched for quite some time and didn't find a solution to this.
I'd like to use strncpy
but have it be UTF8 aware so it doesn't partially write a utf8 code-point into the destination string.
Otherwise you can never be sure that the resulting string is valid UTF8, even if you know the source is (when the source string is larger than the max length).
Validating the resulting string can work but if this is to be called a lot it would be better to have a strncpy function that checks for it.
glib has g_utf8_strncpy
but this copies a certain number of unicode chars, whereas Im looking for a copy function that limits by the byte length.
To be clear, by "utf8 aware", I mean that it should not exceed the limit of the destination buffer and it must never copy only part of a utf-8 code-point. (Given valid utf-8 input must never result in having invalid utf-8 output).
Note:
Some replies have pointed out that strncpy
nulls all bytes and that it wont ensure zero termination, in retrospect I should have asked for a utf8 aware strlcpy
, however at the time I didn't know of the existence of this function.