Is it legal to call memchr with a too-long length, if you know the character will be found before reaching the end of the valid region?

Question

Is the following defined behavior in C11 and C++11¹?

bool has4() {  
    char buf[10] = {0, 1, 2, 4};
    return memchr(buf, 4, 20);
}

Here we pass a too-long length to memchr. The array has 10 elements but we pass 20. The element we are searching for, however, is always found before the end. It is clear to me if this is legal.

If this is allowed, it would limit implementation flexibility, since the implementation cannot rely on the size being a valid indication of the size of the accessible memory region and hence must be careful about reading beyond the found element. An example would be an implementation that wants to do a 16 byte SIMD load starting at the passed-in pointer and then check all the 16 bytes in parallel. If the user passes a length of 16, this would be safe only if the entire length was required to be accessible.

Otherwise (if the above code is legal) the implementation must avoid potentially faulting on elements past the target element, for example by aligning the load (potentially expensive) or checking if the pointer is near the end of a protection boundary.

¹ Here's one of those rare questions where I guess tagging both C and C++ is valid: as far as I can tell the C++ standard just defers directly to the C standard here, via reference, in terms of behavior, but if that's not the case I want to know.

So long as you don't search beyond the end of the range, it's defined. — AndyG, Nov 15 '17 at 19:44
Thanks @AndyG - what do you mean by "you" here? Do you mean "so long as the `memchr` implementation wouldn't be required to search beyond the end of the range"? — BeeOnRope, Nov 15 '17 at 19:45
@AndyG: Do you have a reference that explains exactly why `memchr` isn't allowed to touch `buf[19]` if `buf[3]` contains a match? So you're allowed to pass a length that goes into an unmapped page? `memchr` doesn't work like `memchr(char buf[static count], char needle, size_t count)`? — Peter Cordes, Nov 15 '17 at 19:46
Assuming http://en.cppreference.com/w/c/string/byte/memchr is correct, accessing beyond the actual limits leads to undefined behavior. I.e. there's no issue as long as a result is found before an access occurs past the end of the actual array. — AVH, Nov 15 '17 at 19:46
@PeterCordes: It was easier to put it into an answer. Pulling from n1548 here. cppreference also corroborates this. — AndyG, Nov 15 '17 at 19:50
See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1533.htm — T.C., Nov 15 '17 at 21:08
@T.C. - thanks! Good to see it has explicitly been discussed. — BeeOnRope, Nov 16 '17 at 07:56

AndyG · Accepted Answer · 2017-11-15T20:08:34.203

12

In C11 and C++17 (emphasis mine)

void *memchr(const void *s, int c, size_t n);

The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s. The implementation shall behave as if it reads the characters sequentially and stops as soon as a matching character is found.

So long as memchr finds what it's looking for before you step out of bounds, you're okay.

C++11 and C++14 both use C99, which doesn't have such wording. (They refer to ISO/IEC 9899:1999)

C99 wording:

void *memchr(const void *s, int c, size_t n);

The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s.

By not defining what happens if you pass too large of a size, the behavior is undefined in C99

edited Nov 15 '17 at 20:08

answered Nov 15 '17 at 19:47

AndyG

39,700
8
109
143

1

Ok, that does explicitly disallow an asm implementation that assumes it can safely touch all the bytes out to `buf[count-1]` (e.g. with a SIMD vector), without some kind of implementation-specific check that it can't actually fault (like [checking that it doesn't cross a page boundary on a typical CPU](https://stackoverflow.com/questions/37800739/is-it-safe-to-read-past-the-end-of-a-buffer-within-the-same-page-on-x86-and-x64). – Peter Cordes Nov 15 '17 at 19:52
What is the wording in `C99`? – BeeOnRope Nov 15 '17 at 20:05
1

@BeeOnRope: Updated post with exact wording from both C99 and C11 – AndyG Nov 15 '17 at 20:08
1

@JohannesSchaub-litb: It certainly doesn't disallow SIMD, it just mildly constrains your start-up options for doing an unaligned multi-byte load. (Aligned loads can never cross cache-line or page boundaries, so they're always safe if they include any bytes you know you're allowed to touch.) Interesting point about a C implementation that catches access violations, though. That would be possible if it didn't provide POSIX signal handling, e.g. maybe a bare-metal one that didn't expose #GP exceptions in C. – Peter Cordes Nov 15 '17 at 20:12
I suppose you could also have an implementation that still exposed the #GP exceptions in C, but only after filtering them. I.e., the implementation handles them initially, checks if they are caused by one of the "might GP" routines and handles them internally in that case, otherwise passes them on to the user handler. Java is a little bit like that: it uses GPs to catch unusual cases like rare null pointers access (implicit null), and recovers if the GP occurred in such a context. In other contexts it will panic with info about the GP (dunno if user native code can handle GP in that case). – BeeOnRope Nov 15 '17 at 20:18
@PeterCordes sorry, I figured I would first read more carefully to comment again. Looks like your reply raced with my deletion. – Johannes Schaub - litb Nov 15 '17 at 20:23
@PeterCordes I think I understand now. You say if the first few bytes are within the last word of the page, and it starts SIMD-reading from the provided address, the vector could cross a page boundary. It seems that would be a poor implementation, because all subsequent SIMD-reads would also be unaligned. I would expect the implementation to have a pre-loop code that patches the address up to be aligned first, reading the first bytes using "char*". – Johannes Schaub - litb Nov 15 '17 at 20:30
@JohannesSchaub-litb: often a good SIMD strategy is a (possibly) unaligned first vector, then aligned later vectors. Overlap between the unaligned load and the aligned loads happens only if the pointer is actually not aligned. It's potentially good for `memchr` because all results from 0 to 15 have the same branching. (packed compare -> count is branchless) [See discussion on another Q&A about memchr strategies.](https://stackoverflow.com/questions/43483378/why-arent-stdcount-and-stdfind-optimised-to-use-memchr/47301272?noredirect=1#comment81583474_47301272) – Peter Cordes Nov 15 '17 at 20:40
@JohannesSchaub-litb - in addition to what has been mentioned there, unaligned access has gotten _a lot_ better in the last decade on mainstream implementations such as Intel x86. On the last few generations of Intel chips the unaligned penalty has disappeared or nearly so for many algorithms. For example, Intel chips have basically zero penalty for unaligned loads that don't cross a cache line, and those that do can still execute 1 per cycle (versus two per cycle for non-crossing): since most loops probably don't exceed 1 load/cycle even this penalty is often invisible. – BeeOnRope Nov 15 '17 at 20:56
2

So setting up "lead in" code that tries to get to an aligned state might kill your performance with short strings/regions, which is often exactly what matters. A reasonable strategy is to do some unaligned (cheap) stuff for a few iterations until you have reached some crossover point at which point you assume the string is long enough and you flip to aligned. Sometimes just never flipping to aligned at all is probably better on modern hardware. – BeeOnRope Nov 15 '17 at 20:58
I'll remark that changes between standard revisions to the behaviour of library functions can be risky to rely on in practice, since compilers and libraries are developed separately; e.g. there are plenty of implementations of GCC 7 for Windows, that don't support %zu in printf – M.M Nov 16 '17 at 03:24

Is it legal to call memchr with a too-long length, if you know the character will be found before reaching the end of the valid region?

1 Answers1

Linked