3

According to man 3 memccpy, the memccpy function is defined as follows:

SYNOPSIS

#include <string.h>

void *memccpy(void *dest, const void *src, int c, size_t n);

DESCRIPTION

The memccpy() function copies no more than n bytes from memory area src to memory area dest, stopping when the character c is found.

If the memory areas overlap, the results are undefined.

What confuses me is that memccpy copies up to n bytes and stops when the character c is found, yet the function takes int c as an argument. So what happens if I call memccpy with the following value:

memccpy(&x, &y, 0xffffff76, 100);

Here the value to check for is too large to fit in a char. Should this call work?

Marco Bonelli
stht55
  • Does https://stackoverflow.com/questions/5919735/why-does-memset-take-an-int-instead-of-a-char answer your question? – KamilCuk May 18 '22 at 22:10
  • @KamilCuk no I don't see an explanation on how exactly this case is handled in code. – stht55 May 18 '22 at 22:12
  • Some manuals mention that `c` is *converted* to `unsigned char`. So it is basically taking the value of `(unsigned char)c` - https://en.cppreference.com/w/c/string/byte/memccpy , https://pubs.opengroup.org/onlinepubs/9699919799/functions/memccpy.html – Eugene Sh. May 18 '22 at 22:12
  • At least one implementation (i.e., GNU) of `memccpy` calls `memchr` with the value passed. So, you'd want to ask "what does `memchr` do with the value you want to test". – Jeff Holt May 18 '22 at 22:14

3 Answers

6

memccpy() is defined by POSIX.1-2001 (IEEE Std 1003.1-2001), which states:

SYNOPSIS

#include <string.h>

void *memccpy(void *restrict s1, const void *restrict s2,
       int c, size_t n);

DESCRIPTION

The memccpy() function shall copy bytes from memory area s2 into s1, stopping after the first occurrence of byte c (converted to an unsigned char) is copied, or after n bytes are copied, whichever comes first. If copying takes place between objects that overlap, the behavior is undefined.

So there you go, a simple unsigned char conversion takes place:

void *memccpy(void *restrict s1, const void *restrict s2, int c, size_t n) {
    unsigned char actual_c = (unsigned char)c;
    // ...
}

In fact, the most prominent C standard library implementations that I know of do exactly this:

  • GNU libc: passed to memchr, which does c = (unsigned char) c_in;
  • BSD libc: unsigned char uc = c;
  • Bionic (Android): borrowed from BSD unsigned char uc = c;
  • musl libc: reassignment with cast c = (unsigned char)c;
  • uClibc: cast at comparison: (((unsigned char)(*r1++ = *r2++)) != ((unsigned char) c))
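The conversion these implementations share can be sketched in a few lines. What follows is a minimal, byte-at-a-time illustration based only on the POSIX description quoted above. The name my_memccpy is hypothetical, and this is not the code of any of the libraries listed:

```c
#include <stddef.h>

/* Hypothetical name; an illustrative sketch, not any libc's actual code. */
void *my_memccpy(void *restrict dest, const void *restrict src, int c, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    /* The int argument is reduced to a single byte before any comparison. */
    const unsigned char stop = (unsigned char)c;

    while (n-- > 0) {
        *d = *s++;              /* copy one byte */
        if (*d++ == stop)
            return d;           /* one past the copied stop byte */
    }
    return NULL;                /* stop byte not seen in the first n bytes */
}
```

With this sketch, passing -138 (what 0xffffff76 typically becomes as a 32-bit int) or 0x76 as c stops at the same byte, which is 'v' in ASCII.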
Marco Bonelli
4

how exactly this case is handled in code

The value of the parameter is simply converted to unsigned char:

void *memccpy(..., int param_c, ...) {
     unsigned char c = param_c;

In real life: https://github.com/lattera/glibc/blob/master/string/memccpy.c#L33 and https://github.com/lattera/glibc/blob/master/string/memchr.c#L63.

On today's systems unsigned char has 8 bits, so (unsigned char)(int)0xffffff76 simply becomes 0x76. The upper bits are discarded.
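To see the truncation in isolation, here is a small sketch. stop_byte is a hypothetical helper, and the values mentioned assume CHAR_BIT is 8. Note that the constant 0xffffff76 does not fit in a 32-bit int, so its conversion to the int parameter is implementation-defined. On common two's-complement platforms it arrives as -138.

```c
#include <limits.h>

/* Hypothetical helper: the byte value memccpy() ends up comparing against. */
unsigned char stop_byte(int c)
{
    /* Conversion to an unsigned type reduces c modulo UCHAR_MAX + 1,
       which is 256 when CHAR_BIT is 8. */
    return (unsigned char)c;
}
```

Under those assumptions stop_byte(-138), stop_byte(0x176) and stop_byte(0x76) all return 0x76, because each value is congruent to 0x76 modulo 256.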

KamilCuk
1

This is an older function that is similar to memset in terms of the arguments it accepts:

void *memset(void *s, int c, size_t n);

It is described in the C standard as follows:

The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.

Both functions date back to at least 4.3 BSD, so it would make sense that they handle their arguments in a similar way.

So given your example, the value 0xffffff76 would be converted to the unsigned char value 0x76, and that is the value it would check for to stop.
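This can be checked against the real library function. The sketch below wraps memccpy in a hypothetical helper, copied_until, that reports how many bytes were copied. It assumes a POSIX system, 8-bit bytes, and that 0xffffff76 reaches the function as the int value -138:

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX declaration of memccpy() */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes copied up to and including the
   stop byte, or 0 if the stop byte was not found within n bytes. */
size_t copied_until(char *dest, const char *src, int c, size_t n)
{
    char *end = memccpy(dest, src, c, n);
    return end ? (size_t)(end - dest) : 0;
}
```

Copying from "abcvxyz" with c set to -138 and with c set to 0x76 should report the same count, 4, since both stop at 'v'. In the same way, memset(buf, -138, 1) stores the byte 0x76.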

dbush