I have a piece of memory I am "guarding", defined by

typedef unsigned char byte;

byte * guardArea;
size_t guardSize;

guardArea = getGuardArea();
guardSize = getGuardSize();

An acceptable implementation for the sake of this would be:

size_t glGuardSize = 1024; /* protect an area of 1kb */
byte * getGuardArea()
{
     return malloc( glGuardSize );
}
size_t getGuardSize()
{
     return glGuardSize;
}

Can the following snippet return true for any pointer (from a different malloc, from the stack etc)?

if ( ptr >= guardArea && ptr < (guardArea + guardSize)) {
     return true;
}

The standard states that:

  • values within the area will return true. (When ptr points into the area, everything behaves correctly.)

  • pointers will be distinct (a == b only if they are the same).

  • all addresses within the byte array can be accessed by incrementing the base.
  • any pointer can be converted to and from a char *, without damage.

So I can't understand how the result could be true for any pointer from a different object (as it would break the distinct rule for one of the pointers within the area).

Edit:

What is the use case?

The ability to detect whether a pointer is within a region is really important; at some point, code like this gets written:

if (  isInMyAreaOfInterest( unknownPointer ) ) {
    doMySpecialThing( unknownPointer );
} else {
    doSomethingElse( unknownPointer );
}

I think the language needs to support the developer by making such constructs simple and obvious. Our interpretation of the standard is that the developer needs to cast to int, due to the "undefined behavior" of comparing pointers to distinct objects.

I was hoping for some clarity of why I can't do what I would like (my snippet), as all the posts on SO I found say that the standard claims undefined behavior, without any explanation, or examples of why the standard is better than how I would like it to work.

At the moment we have a rule, but we neither understand why the rule exists nor question whether the rule is helping us.

Example posts:

SO: checking if a pointer is in a malloced area

SO: C compare pointers

Andreas Rejbrand
mksteve
  • It actually depends on the implementation of `getGuardArea()` and `getGuardSize();`. As long as you don't elaborate more, this question cannot be answered. – Jabberwocky Aug 26 '16 at 07:34
  • Just for curiosity, what is the use case? – rocambille Aug 26 '16 at 07:35
  • I still don't get your problem. If you get some memory with `malloc`, you can be sure that block of memory won't be assigned again (unless `malloc` on your system is awfully broken). Conceptually your snippet works. What's the problem? The casting of pointers? Can you link to any of these posts that talk about UB? – Fabio says Reinstate Monica Aug 26 '16 at 07:54
  • *So I can't understand how the result could be true for any pointer from a different object.* The premise of the question is wrong. If two pointers to different objects are compared using a relational operator, then this results in undefined behavior. The result isn't false or true; instead anything is allowed to happen, as it is undefined behavior, explained here: http://stackoverflow.com/a/4105123/4082723 – 2501 Aug 26 '16 at 07:58
  • @2501 In practice though, pretty much all real-world systems will treat pointers as addresses, and addresses are just integers. An integer comparison executed by a real world CPU will not cause crashes or strange behavior, no matter what the C standard happens to say. You just can't know what result it will yield in case the pointers point at different objects, and that's all. – Lundin Aug 26 '16 at 08:04
  • @Lundin: There is no real-world reason why compilers written by non-obtuse programmers for modern linear-address machines should not interpret relational operators as defining a consistent non-overlapping ordering among all data pointers. Unfortunately, compiler behavior which would for decades have been correctly recognized as obtuse is today regarded as fashionable. – supercat Aug 26 '16 at 20:46

2 Answers

It is still possible for an allocation to generate a pointer that satisfies the condition despite the pointer not pointing into the region. This will happen, for example, on an 80286 in protected mode, which is used by Windows 3.x in Standard mode and OS/2 1.x.

In this system, pointers are 32-bit values, split into two 16-bit parts, traditionally written as XXXX:YYYY. The first 16-bit part (XXXX) is the "selector", which chooses a bank of 64KB. The second 16-bit part (YYYY) is the "offset", which chooses a byte within that 64KB bank. (It's more complicated than this, but let's just leave it at that for the purpose of this discussion.)

Memory blocks larger than 64KB are broken up into 64KB chunks. To move from one chunk to the next, you add 8 to the selector. For example, the byte after 0101:FFFF is 0109:0000.

But why do you add 8 to move to the next selector? Why not just increment the selector? Because the bottom three bits of the selector are used for other things.

In particular, the bottom bit of the selector is used to choose the selector table. (Let's ignore bits 1 and 2 since they are not relevant to the discussion. Assume for convenience that they are always zero.)

There are two descriptor tables that a selector can refer to, the Global Descriptor Table (GDT, for memory shared across all processes) and the Local Descriptor Table (LDT, for memory private to a single process). Therefore, the selectors available for process private memory are 0001, 0009, 0011, 0019, etc. Meanwhile, the selectors available for global memory are 0008, 0010, 0018, 0020, etc. (Selector 0000 is reserved.)

Okay, now we can set up our counter-example. Suppose guardArea = 0101:0000 and guardSize = 0x00020000. This means that the guarded addresses are 0101:0000 through 0101:FFFF and 0109:0000 through 0109:FFFF. Furthermore, guardArea + guardSize = 0111:0000.

Meanwhile, suppose there is some global memory that happens to be allocated at 0108:0000. This is a global memory allocation because the selector is an even number.

Observe that the global memory allocation is not part of the guarded region, but its pointer value does satisfy the numeric inequality 0101:0000 <= 0108:0000 < 0111:0000.
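The counter-example can be checked with a small model. This is only a sketch of the simplified architecture described above, not real 80286 code: the type `far_ptr` and the helpers `make_far`, `naive_in_range`, and `really_in_block` are hypothetical names introduced for illustration, and a pointer is modeled as the 32-bit number selector:offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a 16:16 pointer stored as one 32-bit value,
   selector in the high half, offset in the low half. */
typedef uint32_t far_ptr;

static far_ptr make_far(uint16_t selector, uint16_t offset)
{
    return ((uint32_t)selector << 16) | offset;
}

/* The range test from the question, applied to the raw pointer values. */
static bool naive_in_range(far_ptr p, far_ptr begin, far_ptr end)
{
    return p >= begin && p < end;
}

/* Ground truth in this model: a pointer belongs to the block only if its
   selector is one of the block's own selectors (first, first + 8, ...). */
static bool really_in_block(far_ptr p, uint16_t first_selector, unsigned chunks)
{
    uint16_t selector = (uint16_t)(p >> 16);
    for (unsigned i = 0; i < chunks; i++) {
        if (selector == (uint16_t)(first_selector + 8u * i)) {
            return true;
        }
    }
    return false;
}
```

With the values from the text, `naive_in_range(make_far(0x0108, 0), make_far(0x0101, 0), make_far(0x0111, 0))` is true, while `really_in_block(make_far(0x0108, 0), 0x0101, 2)` is false: the numeric test accepts a pointer that the block does not own.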

Bonus chatter: Even on CPU architectures with a flat memory model, the test can fail. Modern compilers take advantage of undefined behavior and optimize accordingly. If they see a relational comparison between pointers, they are permitted to assume that the pointers point into the same array (or one past the last element of that array). Specifically, the only pointers that can legally be compared with guardArea are the ones of the form guardArea, guardArea+1, guardArea+2, ..., guardArea + guardSize. For all of these pointers, the condition ptr >= guardArea is true and can therefore be optimized out, reducing your test to

if (ptr < (guardArea + guardSize))

which will now be satisfied for pointers that are numerically less than guardArea.
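To make the consequence concrete, here is a sketch that models addresses as plain integers, so the code itself has no undefined behavior. It does not claim that any particular compiler performs this transformation; it only compares the test as written with the reduced test described above. The function names `test_as_written` and `test_after_optimization` are made up for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* The test as the programmer wrote it, on integers standing in for addresses. */
static bool test_as_written(uintptr_t ptr, uintptr_t guard, uintptr_t size)
{
    return ptr >= guard && ptr < guard + size;
}

/* The test after the lower-bound check has been assumed true and removed. */
static bool test_after_optimization(uintptr_t ptr, uintptr_t guard, uintptr_t size)
{
    return ptr < guard + size;
}
```

For a guard area at 0x2000 of size 0x400 and an unrelated address 0x1000, `test_as_written` rejects the address while `test_after_optimization` accepts it.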

Moral of the story: This code is not safe, not even on flat architectures.

But all is not lost: The pointer-to-integer conversion is implementation-defined, which means that your implementation must document how it works. If your implementation defines the pointer-to-integer conversion as producing the numeric value of the pointer, and you know that you are on a flat architecture, then what you can do is compare integers rather than pointers. Integer comparisons are not constrained in the same way that pointer comparisons are.

if ((uintptr_t)ptr >= (uintptr_t)guardArea &&
    (uintptr_t)ptr < (uintptr_t)guardArea + (uintptr_t)guardSize)
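Wrapped up as a function, the integer comparison might look like the sketch below. It assumes a flat address space and an implementation that documents pointer-to-integer conversion as yielding the numeric address; the name `in_guard_area` is hypothetical. The upper bound is written as a subtraction, as suggested in the comments, so that `base + guard_size` can never wrap around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumes a flat address space where (uintptr_t)p is the numeric address of p. */
static bool in_guard_area(const void *ptr, const void *guard_area, size_t guard_size)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)guard_area;

    /* p - base is only evaluated when p >= base, so it cannot underflow,
       and comparing the difference avoids overflow in base + guard_size. */
    return p >= base && p - base < guard_size;
}
```

A call such as `in_guard_area(unknownPointer, guardArea, guardSize)` could then stand in for the `isInMyAreaOfInterest` check from the question, on platforms where the stated assumptions hold.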
Raymond Chen
  • The problem with such systems was probably just horribly bad compilers that used two different pointer types, one 16 bit pointer and one "far" pointer which includes the bank. (far pointers were never covered by the C standard.) Nobody should pick systems with banked memory for new designs in the year 2016 though. Sure, they live on in various icky 8 bit and 16 bit microcontroller implementations, but those are quickly turning just as obsolete as 80286. Anyone who decides to port their code to such a system deserves all the bugs they can get. – Lundin Aug 26 '16 at 08:12
  • @Lundin The question did not say "Assume I am not on an icky processor." See also the bonus chatter, which demonstrates that this code is unsafe even on a modern flat architecture. – Raymond Chen Aug 26 '16 at 08:20
  • The *"Bonus chatter"* portion of this answer perfectly illustrates a major flaw in contemporary (2016) optimizing compilers. If the programmer writes `if (ptr>=guardArea && ptr<(guardArea + guardSize))` and the compiler silently drops the first condition, then that compiler is seriously broken. Detecting undefined behavior is ***not*** an opportunity for optimization, it ***is*** an opportunity to issue a diagnostic. – user3386109 Aug 26 '16 at 18:44
  • @user3386109 On the other hand, sometimes you *want* the optimization. E.g. `bool inRange(char*p,char*start, int length){return p>=start && p – Raymond Chen Aug 26 '16 at 19:56
  • @RaymondChen: The authors of C89 assumed that if compilers for targets with certain obviously-useful features (like a fully-defined non-overlapping ordering of all pointers) were consistently supporting such features before the Standard was written, there was no need for the Standard to mandate such support on platforms where it would provide great benefits at essentially zero cost. The more obvious the benefit, the less need for a mandate. I'm not sure what exactly promoted the attitude that programmers should rely on nothing that the Standard doesn't mandate, but it... – supercat Aug 26 '16 at 19:59
  • ...makes things less efficient rather than more. With regard to your example, if the compiler could tell that the call to `findSomething` will always return a pointer to part of the array without relying upon the UBness of pointer comparisons, it could omit the first test without relying upon the UBness of pointer comparisons. If the `findSomething` function could return a pointer outside the array, however, omitting the first test would be bad. I fail to see any benefit to the UBness of the behavior in question on platforms with linear addressing. – supercat Aug 26 '16 at 20:05
  • @RaymondChen: Some compilers might be able to convert most uses of the portable `int needleInHaystack(unsigned char *needle, unsigned char *haystack, size_t haystack_size) { while (haystack_size--) if (needle == haystack++) return 1; return 0; }` into a single addition or subtraction and one or two comparisons, but requiring that programmers replace the comparisons with a loop and hope a compiler replaces the loop with the fixed-time comparisons they wanted to write in the first place seems rather icky. – supercat Aug 26 '16 at 20:09
  • I would write the two `if` statements as `if (p-array < 40)` and `if (p-array >= 40)`. So the diagnostic would be a useful indication that the original code is sloppy and verbose, and should be rewritten to be clear and concise. The compiler should implement what the programmer wrote. It should never be making decisions that *"oh, you didn't really mean that, so I'm going to do something different."* Especially without giving any indication that that's what happened. – user3386109 Aug 26 '16 at 20:30
  • @user3386109 Sure, maybe you would write it that way if it was just halves. But suppose you did this: `for (int decile = 0; decile < 40; decile += 4) if (inRange(p, decile, decile + 4)) { ... }`. The compiler decides to unroll the loop, and then optimizes the first iteration. Or would you prefer to manually unroll the loop 10x, and then manually optimize the first iteration? – Raymond Chen Aug 26 '16 at 21:14
  • That code generates a warning: *"incompatible integer to pointer conversion passing `int` to parameter of type `char *`"* Assuming that you get that little problem fixed, it should be clear that using a loop to find the first `decile` is inefficient. It's simple math to compute the first `decile` where `p` is in range. You don't need a loop for that. We could do this all day ... but I've had enough of it for today. – user3386109 Aug 26 '16 at 21:35
  • @user3386109 I should give up trying to come up with specific examples. The point is that inlining, loop unrolling, constant propagation, template expansion, and many other compiler transformations can expose valid optimization opportunities that you would want the compiler to exploit. – Raymond Chen Aug 27 '16 at 03:36
  • Note that GDT vs LDT is actually bit 2, not bit 0. – Yuhong Bao Sep 27 '17 at 20:04
  • The second condition should be written as the algebraically equivalent `ptr - guardArea < guardSize` (add casts). This form is guaranteed not to overflow, where the original is not. That's a concern because such code often appears in contexts where untrusted input determines one or more of the values used in the expression, and an overflow could open a security vulnerability. Note that `ptr - guardArea` cannot overflow because the first condition ensures that it remains positive. – Phil Miller Sep 27 '17 at 21:34
  • @YuhongBao I know, but I simplified the architecture for the purpose of discussion. Whether the LDT bit is 0, 1, or 2 is immaterial. – Raymond Chen Sep 27 '17 at 23:21
  • @user3386109: Compilers don’t “detect” undefined behavior and then mis-compile the code with no diagnostic out of spite. Lacking the runtime values of everything, as of course they do, they can at most detect the *possibility* of undefined behavior (which exists, unrealized, in most correct programs) and apply transformations that will make the program faster without changing its meaning in any case where undefined behavior does not occur. (Not changing its meaning in all cases isn’t even an option once we accept that erroneous operations like buffer overflows don’t *have* a meaning.) – Davis Herring Sep 24 '21 at 13:24

Yes.

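void foo(void) {}
void (*a)(void) = foo;   /* pointer to a function (code) */
void *b = malloc(69);    /* pointer to an object (data) */
/* pointer-to-integer conversion needs explicit casts */
uintptr_t ua = (uintptr_t)a, ub = (uintptr_t)b;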

ua and ub are in fact permitted to have the same value. This occurred frequently on segmented systems (like MS-DOS) which might put code and data in separate segments.

geocar