Imagine I have this C function (and the corresponding prototype in a header file):

void clearstring(const char *data) {
    char *dst = (char *)data;
    *dst = 0;
}

Is there Undefined Behaviour in the above code, casting the const away, or is it just a terribly bad programming practice?

Suppose no const-qualified objects are involved:

char name[] = "pmg";
clearstring(name);
pmg
  • 2
    If the cast isn't UB, I think it should be :) – pmg Jan 31 '12 at 11:55
  • you certainly have your foot squarely in the shotgun sights! – Rob Agar Jan 31 '12 at 11:57
  • 1
    @pmg: if the cast itself were UB, then there would be little point in the language permitting it - it's easy enough for a compiler to detect that `const` has been removed in a cast, the same way it detects that `char *dst = data;` is illegal. Obviously there are some pointless things that the standard permits for historical reasons, but I claim that this is not one of them :-) – Steve Jessop Jan 31 '12 at 12:08
  • Does this answer your question? [Is const\_cast safe?](https://stackoverflow.com/questions/357600/is-const-cast-safe) – user202729 Jan 19 '22 at 16:12
  • @user202729: by the gist of it, yes. It says approximately the same as answers here, but with a flavor of C++. – pmg Jan 19 '22 at 16:24
  • Sorry, wrong language, retracted. – user202729 Jan 19 '22 at 16:25

2 Answers

The attempt to write to *dst is UB if the caller passes you a pointer to a const object, or a pointer to a string literal.

But if the caller passes you a pointer to data that in fact is mutable, then behavior is defined. Creating a const char* that points to a modifiable char doesn't make that char immutable.

So:

char c;
clearstring(&c);    // OK, sets c to 0
char *p = malloc(100);
if (p) {
    clearstring(p); // OK, p now points to an empty string
    free(p);
}
const char d = 0;
clearstring(&d);    // UB
clearstring("foo"); // UB

That is, your function is extremely ill-advised, because it is so easy for a caller to cause UB. But it is in fact possible to use it with defined behavior.
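
One way to reduce that risk, sketched here under the assumption that callers only ever need to clear modifiable buffers, is to drop the `const` from the parameter so the compiler has to complain when a pointer to a `const` object is passed. The names `clearstring_mutable` and `demo_defined_calls` are hypothetical and exist only for this illustration:

```c
#include <stddef.h>
#include <string.h>

/* The function from the question: the cast itself is permitted, and the
   write is defined only when the pointed-to object is really modifiable. */
void clearstring(const char *data) {
    char *dst = (char *)data;
    *dst = 0;
}

/* Hypothetical variant with a non-const parameter: passing a pointer to a
   const object now discards a qualifier and requires a diagnostic. */
void clearstring_mutable(char *data) {
    *data = 0;
}

/* Hypothetical demo: both calls are defined because both arrays are
   modifiable objects. Returns the combined length of the cleared strings. */
size_t demo_defined_calls(void) {
    char name[] = "pmg";
    char other[] = "abc";

    clearstring(name);          /* defined: name is a modifiable array */
    clearstring_mutable(other); /* defined for the same reason */

    /* const char d = 0;
       clearstring_mutable(&d);    would need a diagnostic: discards const */

    return strlen(name) + strlen(other);
}
```

A string literal would still be accepted by `clearstring_mutable` without complaint, because string literals are not `const` objects in C, so that case remains undefined behaviour.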

Steve Jessop
  • 9
    +1: (im)mutability is an inherent property of the object itself, regardless of the qualification of the pointer used to access it... – Christoph Jan 31 '12 at 12:01
  • Is this UB because of `C99 6.6 §9` or because of `C99 6.7.3 §5`? – Lundin Jan 31 '12 at 14:13
  • 1
    @Lundin: the latter (and 6.4.5/6 rather than 6.7.3/5 in the case of the string literal, since string literals are not `const` objects in C). Address constants have nothing to do with this. – Steve Jessop Jan 31 '12 at 14:24

Consider a function like strstr which, if given a pointer to a part of an object containing a string, will return a pointer to a possibly different part of the same object. If the function is passed a pointer to a read-only area of memory, it will return a pointer to a read-only area of memory; likewise, if it is given a pointer to a writable area, it will return a pointer to a writable area.

There is no way in C to have a function return a const char * when given a const char *, and return an ordinary char * when given an ordinary char *. In order to be compatible with the way strstr worked before the idea of a const char * was added to the language, it has to convert a const-qualified pointer into a non-const-qualified pointer. While it's true that as a library function strstr might be entitled to do such a cast even if user code could not, the same pattern comes up often enough in user code that it would not be practical to forbid it.
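
A single function still cannot do this, but since C11 a macro built on `_Generic` can approximate it by choosing the result type from the argument type. The following is only a minimal sketch; `find_substr` is a hypothetical name, and it assumes the first argument is a pointer expression of type `char *` or `const char *`:

```c
#include <string.h>

/* Hypothetical const-preserving wrapper around strstr.
   _Generic selects a branch from the type of (haystack), so the result
   is const char * for a const char * argument and char * for a char *
   argument. Any other argument type fails to compile. */
#define find_substr(haystack, needle)                              \
    _Generic((haystack),                                           \
        const char *: (const char *)strstr((haystack), (needle)),  \
        char *:       (char *)strstr((haystack), (needle)))
```

With this macro, `const char *hit = find_substr(cp, "lo");` keeps the qualifier when `cp` is a `const char *`, while assigning that result to a plain `char *` would require a diagnostic. The cast that removes `const` is still present, but it is confined to the one branch where the caller supplied a non-const pointer in the first place.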

supercat
  • As of C11, C has the `_Generic` keyword which can be used to define functions which are generic over a type, and as of C23, `strstr` has been defined to return a pointer which matches the `const`ness of its input. – JM0 Mar 24 '23 at 16:09
  • @JM0: Is it practical to write a function prototype that would allow a function that passes a received argument to `strstr`, and then returns the output of `strstr` to the caller, to advertise such semantics when processed by C23, but which would still be compatible with existing compilers? – supercat Mar 24 '23 at 17:53
  • The paper which introduced the change, [N3020](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3020.pdf), has an example of some macro magic that works on compilers today and makes `strstr` generic over `const`. – JM0 Apr 21 '23 at 02:39
  • @JM0: Interesting. I still think there should be a qualifier to indicate that a returned pointer will be based upon a passed-in pointer, so as to allow a compiler given `int *q = someFunction(p);`, where `p` is a restrict-qualified pointer, could know whether the return value is always, never, or sometimes based upon `p`, and whether the function call may have persisted any pointer based upon `p` other than `q`. – supercat Apr 21 '23 at 16:00
  • @JM0: Being able to express such information in a function signature would be especially useful in cases where functions receive and invoke pointers to outside functions, since even a "whole program optimize" compiler would not in general be able to look inside all of the functions that might be invoked via function pointer. – supercat Apr 21 '23 at 16:16
  • Yeah, the paper introduces the types `QChar`, `QVoid`, and `QWchar_t` to express that, but they're only documentation conventions and aren't actually part of the language. If you want it to actually be part of the language, you're going to have to take it up with the committee. Though good news, compilers can still warn if the return value doesn't match: [example](https://godbolt.org/z/5jzndK8xP). – JM0 Apr 21 '23 at 17:24
  • IMHO, the Committee's priority should be to define meaningful categories of conformance, and recognize more different categories of implementations for different purposes. Back when the compiler marketplace was dominated by compilers that needed their products to appeal to the programmers that would be using them, it made sense for the Standard to waive jurisdiction over anything that might be controversial, treating such things as "quality of implementation" issues, but such an approach breaks down if the only way for a program to have a broad audience is to work around the quirks... – supercat Apr 21 '23 at 18:31
  • ...of a compiler that is deliberately incompatible with code written for better quality compilers, except when processing code in gratuitously inefficient fashion. – supercat Apr 21 '23 at 18:32