8

This seems to be a fairly common pattern e.g. in hexchat (may not compile, see also plugin docs. also note that hexchat_plugin_get_info hasn't been used in forever so I'm omitting it for simplicity):

static hexchat_plugin *ph;
static int timer_cb(void *userdata) {
    if (hexchat_set_context(ph, userdata)) { /* <-- is this line UB? */
        /* omitted */
    }
    return 0;
}
static int do_ub(char *word[], char *word_eol[], void *userdata) {
    void *context = hexchat_get_context(ph);
    hexchat_hook_timer(ph, 1000, timer_cb, context);
    hexchat_command(ph, "close"); /* free the context - in practice this would be done by another plugin or by the user, not like this, but for the purposes of this example this simulates the user closing the context. */
    return HEXCHAT_EAT_ALL;
}
int hexchat_plugin_init(hexchat_plugin *plugin_handle, char **plugin_name, char **plugin_desc, char **plugin_version, char *arg) {
    *plugin_name = "do_ub";
    *plugin_desc = "does ub when you /do_ub";
    *plugin_version = "1.0.0";
    ph = plugin_handle;
    /* etc */
    hexchat_hook_command(ph, "do_ub", 0, do_ub, "does UB", NULL);
    return 1;
}

The line in timer_cb causes hexchat to compare the (potentially free'd - definitely free'd in this example, see the comment in do_ub) pointer with another pointer, if you follow from here (plugin.c#L1089, hexchat_set_context) you'll end up in here (hexchat.c#L191, is_session). To invoke this code, run /do_ub in hexchat.

Relevant code:

int
hexchat_set_context (hexchat_plugin *ph, hexchat_context *context)
{
    if (is_session (context))
    {
        ph->context = context;
        return 1;
    }
    return 0;
}

int
is_session (session * sess)
{
    return g_slist_find (sess_list, sess) ? 1 : 0;
}

Is this sort of thing UB?

SoniEx2
  • 1,864
  • 3
  • 27
  • 40
  • I've seen libraries where functions take pointers, see if they are NULL and allocate something for those if they are NULL, otherwise just work with where they point at. So, I doubt it's UB. – Blaze Oct 03 '18 at 14:03
  • 3
    Why is `userdata` potentially freed? If something can still access it then it should still exist. If it's freed then another object can get allocated at the same memory location. – Kevin Oct 03 '18 at 14:07
  • 3
    Using the value of freed pointer for anything is UB. – Eugene Sh. Oct 03 '18 at 14:07
  • Where is the pointer being "freed". Are you concerned another function deallocated or dereferenced the location? – Hogstrom Oct 03 '18 at 14:07
  • The value of the pointer after being freed is indeterminate, as told [here](https://stackoverflow.com/questions/47898934/new-pointer-by-malloc-is-same-as-one-of-the-freed-old-pointers/47902236#47902236), [here](https://stackoverflow.com/questions/11239707/why-are-the-contents-pointed-to-by-a-pointer-not-changed-when-memory-is-dealloca/47385107#47385107), and [here](https://stackoverflow.com/questions/52268868/understanding-dangling-pointer-behaviour-in-this-case/52268958#52268958). – StoryTeller - Unslander Monica Oct 03 '18 at 14:07
  • 4
    Can you show the relevant code instead of pasting links? – Ajay Brahmakshatriya Oct 03 '18 at 14:07
  • 1
    This example is fairly unclear... what is `ph`? – Lundin Oct 03 '18 at 14:21
  • 5
    In C, a comparison with a free'd pointer is UB (unless the free'd pointer is a _null pointer_). Even assignment `p = freed_pointer;` is UB. Unknown how this applies within hexchat or OP's question and code here lacks completeness. – chux - Reinstate Monica Oct 03 '18 at 14:35
  • Jerking a pointer that you passed to a library's init() function is never a very good idea. Not exactly UB in the library when the usage is obvious, a session would be expected to always survive a chat event. – Hans Passant Oct 03 '18 at 14:42
  • why the hold? This is a very good question. Its clearly stated, with sample code. The answer provided below is very interesting and provides a great answer. I learned something from the exchange – pm100 Oct 03 '18 at 15:49

3 Answers3

7

Using a value of a pointer after the object it is pointing to have reached it's lifetime end is indeterminate as stated in the C11 Standard draft 6.2.4p2 (Storage durations of objects) (the emphasis is mine):

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

And using it's value (just for anything) is an explicit undefined behavior as stated in Annex J.2(Undefined behavior):

The behavior is undefined in the following circumstances: [...] The value of a pointer to an object whose lifetime has ended is used (6.2.4).

Eugene Sh.
  • 17,802
  • 8
  • 40
  • 61
  • Was about to answer with exactly the same two paragraphs but from the C17 draft. Would maybe add though that reassigning the variable is okay, since assigning to `NULL` after `free(ptr)` is a common pattern. – nullp0tr Oct 03 '18 at 14:49
  • @nullp0tr Obviously saying `free(p); p = NULL;` is okay. But `free(p); q = p;` is not. The latter case is what I think people mean when they say "assigning a freed pointer". – Steve Summit Oct 03 '18 at 15:00
  • Yeah, using pointer variable to assign it with something else is OK, using it's content as it's left by `free` is not. – Eugene Sh. Oct 03 '18 at 15:15
  • Note; The narrow exception in the [comment](https://stackoverflow.com/questions/52628773/does-comparing-a-pointer-that-has-been-freed-invoke-ub#comment92192331_52628773) was to imply `void *p = malloc(something_that_caused_a_return_of_NULL); ... free(p); ... void *q = p;` was OK. – chux - Reinstate Monica Oct 03 '18 at 16:33
  • With UB due to "The value of a pointer to an object whose lifetime has ended is used", the "value of a pointer becomes indeterminate" seems irrelevant. To me, the first is enough. The "is used" includes a simple assignment or reading attempt of its value. Since these are UB, what does "value of a pointer becomes indeterminate" gain as there is no way, AFAIK, to see that indeterminate value? I suppose that could be a post-able question. – chux - Reinstate Monica Oct 03 '18 at 16:45
  • 1
    @chux The first one is from the standard body, which does not directly imply UB. The second one is Annex, which I see as kind of clarification, as the use of indeterminate values might be interpreted as valid in some circumstances. – Eugene Sh. Oct 03 '18 at 16:48
  • @chux: An access to an Indeterminate Value of a type that has trap representations invokes UB. Under C89, if a type has no trap representations, an Indeterminate value is equivalent to an Unspecified value (since an Indeterminate Value is defined as being either an Unspecified value or trap representation). The fact that Annex J specifies a category of actions that generally invoke UB doesn't imply that all such actions invoke UB. – supercat Oct 03 '18 at 18:41
3

Yes, using a pointer value that has been freed for anything -- even a seemingly-innocuous comparison -- is, strictly speaking, undefined behavior. It's unlikely to cause any actual problems in practice, but I'd say it's worth avoiding.

See also the C FAQ list, question 7.21.

Steve Summit
  • 45,437
  • 7
  • 70
  • 103
  • Not sure if comparing a freed pointer value is undefined bahaviour, but it is pointless for sure. – Jabberwocky Oct 03 '18 at 14:43
  • 2
    @Jabberwocky: In some cases, checking whether `realloc()` has moved a block may avoid the need to recalculate pointers derived from its address. In cases where the `realloc()` is used to shrink a block after the final size becomes known, the scenario where the block stays put may be more common than the one where it moves. – supercat Oct 03 '18 at 18:42
1

tl;dr: The ability to perform certain operations such as comparisons on pointers without regard for the lifetime of objects identified thereby is a popular extension which the vast majority of compilers can be configured to support with optimizations disabled. Support for it is not mandated by the Standard, however, and aggressive optimizers may break code which relies upon it.

When the Standard was written, there were some segmented-memory platforms where attempting to load a pointer into registers would cause the system to retrieve information about the region of memory where the pointer resided. If such information was no longer available, an attempt to retrieve it could have arbitrary consequences outside the jurisdiction of the Standard. For the Standard to require that comparisons involving such pointers have no side effects beyond yielding 0 or 1 would have made the language impractical on such platforms.

While the authors of the Standard were no doubt aware that being able to use comparisons with arbitrary pointers (subject to the caveat that the results may not be particularly meaningful) was a useful feature supported by every implementation targeting conventional hardware, they saw no need to treat it as anything more than a "popular extension" which quality implementations support whenever doing so would be useful and practical.

From C89 Rationale, p.11 line 23:

The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Informative Annex J of the Standard catalogs those behaviors which fall into one of these three categories.

Unfortunately, even though nearly all platforms in use today could support such semantics at essentially zero cost (*), some compiler writers regard their desire to assume that code will never do anything with freed pointers as more important than any value that programmers could receive from what had been an essentially-universally-supported extension on conventional platforms. Unless one can guarantee that anyone using one's code will disable phony "optimizations" imposed by the the authors of over-eager optimizers who seek to rid the language of useful extensions, one may have to write extra code to work around the absence of such extensions.

(*) In some scenarios where a function exposes to outside code multiple pointers to a region of storage it has allocated and freed, a compiler that had to uphold a behavioral guarantee that they such pointers will compare equal would be consequently required to actually perform store operations that would leak the pointers; treating the pointers as indeterminate would allow the stores to be eliminated. Outside of contrived scenarios, however, the cost savings from eliminating such stores with pointers that are leaked to the outside world would rarely have any meaningful effect on performance.

supercat
  • 77,689
  • 9
  • 166
  • 211
  • "essentially no cost" i.e. keeping the pointer value in registers indefinitely or solving the halting problem to prove that they're not needed. – Antti Haapala -- Слава Україні Sep 14 '19 at 17:23
  • @AnttiHaapala: What do you mean? In most situations either a compiler will be able to see everything that is done with pointers to an allocation, or it will have to perform actual pointer computations with the allocation address. If either situation applies, the cost would be zero. The only situation in which there would be any cost would be if pointers could be shown to escape after the associated storage is freed. Achieving perfect optimization in that case would be difficult, but those cases are rare enough that the overall cost is essentially zero in non-contrived scenarios. – supercat Sep 15 '19 at 03:25