How does memccpy handle large integer values?

Question

According to man 3 memccpy the memccpy function is defined as follows:

SYNOPSIS
#include <string.h>

void *memccpy(void *dest, const void *src, int c, size_t n);
DESCRIPTION

The memccpy() function copies no more than n bytes from memory area src to memory area dest, stopping when the character c is found.

If the memory areas overlap, the results are undefined.

What confuses me is that memccpy copies n bytes and stops if character c is found. However, the function takes int c as an argument. So what happens if I call memccpy with the following value:

memccpy(&x, &y, 0xffffff76, 100);

Here the value to check is too big for char. Should this case work?

Does https://stackoverflow.com/questions/5919735/why-does-memset-take-an-int-instead-of-a-char answer your question? — KamilCuk, May 18 '22 at 22:10
@KamilCuk no I don't see an explanation on how exactly this case is handled in code. — stht55, May 18 '22 at 22:12
Some manuals mention that `c` is *converted* to `unsigned char`. So it is basically taking the value of `(unsigned char)c` - https://en.cppreference.com/w/c/string/byte/memccpy ,https://pubs.opengroup.org/onlinepubs/9699919799/functions/memccpy.html — Eugene Sh., May 18 '22 at 22:12
At least one implementation (i.e., GNU) of `memccpy` calls `memchr` with the value passed. So, you'd want to ask "what does `memchr` do with the value you want to test". — Jeff Holt, May 18 '22 at 22:14

Marco Bonelli · Accepted Answer · 2022-05-19T04:17:56.230

memccpy() is defined by POSIX.1-2001 (IEEE Std 1003.1-2001), which states:

SYNOPSIS
#include <string.h>

void *memccpy(void *restrict s1, const void *restrict s2,
       int c, size_t n);
DESCRIPTION

The memccpy() function shall copy bytes from memory area s2 into s1, stopping after the first occurrence of byte c (converted to an unsigned char) is copied, or after n bytes are copied, whichever comes first. If copying takes place between objects that overlap, the behavior is undefined.

So there you go, a simple unsigned char conversion takes place:

void *memccpy(void *restrict s1, const void *restrict s2, int c, size_t n) {
    unsigned char actual_c = (unsigned char)c;
    // ...
}
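To make the conversion concrete, here is a minimal sketch of how the whole function could be written from the POSIX description alone. The name my_memccpy is hypothetical, and this is not the code of any particular libc; it only illustrates where the unsigned char conversion fits into the copy loop.

```c
#include <stddef.h>

/* Hypothetical name: a sketch of the POSIX semantics, not any libc's real code. */
void *my_memccpy(void *restrict s1, const void *restrict s2, int c, size_t n)
{
    unsigned char *d = s1;
    const unsigned char *s = s2;
    unsigned char actual_c = (unsigned char)c;  /* only the low-order byte survives */

    while (n-- > 0) {
        *d = *s++;                 /* copy one byte */
        if (*d++ == actual_c)
            return d;              /* points just past the copied c in s1 */
    }
    return NULL;                   /* c was not among the first n bytes */
}
```

With this sketch, my_memccpy(dst, "abvcd", (int)0xffffff76, 5) would copy "abv" and stop, because the search byte is 0x76, which is 'v' in ASCII.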

In fact, the most prominent C standard library implementations that I know of do exactly this:

GNU libc: passed to memchr which does unsigned char c = (unsigned int)c_in;
BSD libc: unsigned char uc = c;
Bionic (Android): borrowed from BSD unsigned char uc = c;
musl libc: reassignment with cast c = (unsigned char)c;
uClibc: cast at comparison: (((unsigned char)(*r1++ = *r2++)) != ((unsigned char) c))

score 4 · Answer 2 · answered May 18 '22 at 22:14

how exactly this case is handled in code

The value of the parameter is simply converted to unsigned char:

void *memccpy(..., int param_c, ...) {
     unsigned char c = param_c;

In real life: https://github.com/lattera/glibc/blob/master/string/memccpy.c#L33 and https://github.com/lattera/glibc/blob/master/string/memchr.c#L63

On today's systems unsigned char has 8 bits, so (unsigned char)(int)0xffffff76 simply becomes 0x76. The upper bits are ignored.
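The arithmetic can be checked with a small sketch. Both helper names below are hypothetical; the first uses the same cast the implementations use, and the second computes the same value with a mask, assuming nothing about the width of unsigned char beyond what <limits.h> reports.

```c
#include <limits.h>

/* Hypothetical helper: the byte value memccpy() would actually search for. */
unsigned char search_byte(int c)
{
    return (unsigned char)c;            /* reduces c modulo UCHAR_MAX + 1 */
}

/* Same result computed with a mask instead of a cast. */
unsigned int search_byte_masked(int c)
{
    return (unsigned int)c & UCHAR_MAX; /* keeps only the low-order bits */
}
```

On a system with 8-bit bytes, both helpers map (int)0xffffff76 to 0x76, and a value that already fits, such as 0x76 itself, is left unchanged.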

score 1 · Answer 3 · answered May 18 '22 at 22:17

This is an older function which is similar to memset in terms of the argument it accepts:

void *memset(void *s, int c, size_t n);

It is described in the C standard as follows:

The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.

Both functions date back to at least 4.3 BSD, so it would make sense that they handle their arguments in a similar way.

So given your example, the value 0xffffff76 would be converted to the unsigned char value 0x76, and that is the byte memccpy would look for when deciding where to stop.
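The same conversion can be observed with the standard functions memset and memchr, which take an int in the same way. The helper below is a hypothetical demo, not part of any library, and it assumes n is greater than zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical demo helper: fills buf with memset() using a possibly
   out-of-range int, then checks that every byte equals (unsigned char)c
   and that memchr() with the same int finds the first byte.
   Returns 1 on success, 0 otherwise. Assumes n > 0. */
int same_byte_everywhere(unsigned char *buf, size_t n, int c)
{
    memset(buf, c, n);                      /* c is converted to unsigned char */
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != (unsigned char)c)
            return 0;
    }
    return memchr(buf, c, n) == buf;        /* memchr applies the same conversion */
}
```

Calling same_byte_everywhere(buf, 4, (int)0xffffff76) on a system with 8-bit bytes leaves every byte of buf equal to 0x76.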
