2

After reading this, my understanding is that the following program should invoke UB. Am I right?

int main(void)
{
        char *ptr = "ABCD";
        ptr = 'A';
        printf("%c\n", ptr);
}

Thanks.

babon
  • 2
    _Yes, you are right_. But there probably wasn't a need to post a question for this, as there are many similar questions and answers with explanations. – ameyCU Dec 10 '16 at 09:43
  • Got rejected at an interview for saying this is UB. I felt like the interviewer had never heard what UB is. Anyways, will delete this post. – babon Dec 10 '16 at 09:44
  • 1
    @PhilipCouling No errors on my machine. Just a warning about possible incorrect conversion of a pointer. I am using GCC 6.2.1. Could you please explain more? – babon Dec 10 '16 at 09:48
  • 4
    To me the selected duplicate seems very strange - I just can't see how they are equal. – Support Ukraine Dec 10 '16 at 10:18
  • 6
    This is not a duplicate (at least not of the pointed question). This question is about cast of integer to pointer and passing pointer instead of integer as argument of `printf`. The pointed question is about modification of a constant string. – Marian Dec 10 '16 at 10:21
  • @WeatherVane and thus I am really misreading code today :-( I read `printf("%s\n", ptr);` which explains my confusion – Philip Couling Dec 10 '16 at 10:31
  • 1
    Why the downvotes? This is a precise, short and to the point, well-worded question, with a MCVE and a reference of research w/ a related question - practically the holy grail of SO. – Michael Foukarakis Dec 10 '16 at 11:31

3 Answers

3

You are correct, the posted code does invoke undefined behavior in multiple ways.

You probably do not want to work in a company where such code is considered OK.

When I compile the code with my default clang options, I get 4 warnings:

clang -O2 -funsigned-char -std=c11 -Weverything -Wno-padded -Wno-shorten-64-to-32 -Wno-mis\
sing-prototypes -Wno-vla -Wno-missing-noreturn -Wno-sign-conversion -Wno-missing-variable-\
declarations -Wno-unused-parameter -Wwrite-strings -lm -o ub2 ub2.c
ub2.c:3:11: warning: initializing 'char *' with an expression of type 'const char [5]' dis\
cards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
    char *ptr = "ABCD";
          ^     ~~~~~~
ub2.c:4:9: warning: incompatible integer to pointer conversion assigning to 'char *' from \
'int' [-Wint-conversion]
    ptr = 'A';
        ^ ~~~
ub2.c:5:5: warning: implicitly declaring library function 'printf' with type 'int (const c\
har *, ...)'
    printf("%c\n", ptr);
    ^
ub2.c:5:5: note: include the header <stdio.h> or explicitly provide a declaration for 'pri\
ntf'
ub2.c:5:20: warning: format specifies type 'int' but the argument has type 'char *' [-Wfor\
mat]
    printf("%c\n", ptr);
            ~~     ^~~
            %s
4 warnings generated.

3 of these warnings indicate possible undefined behavior:

  • assigning an integer value such as 'A' to a pointer is not valid C, except for the integer constant 0, which is converted to the null pointer. Without a cast it is a constraint violation that the compiler must diagnose; with a cast the result is implementation-defined and may be misaligned or a trap representation, so using it can invoke undefined behavior. Some systems will trigger an exception for some values, some will just store the integer value into the pointer; one cannot rely on it. You will see code like ptr = (char*)0x0040006C; that may compile correctly on the specific system for which it is tailored, but it only works for specific compiler / target combinations.

  • passing a pointer for a %c conversion specifier is explicitly described as invoking undefined behavior in the C Standard. Pointers might be passed to printf in a different way than int values, in a different set of registers for example, so printf would not receive the value of the pointer to print as a character. This value might not be 'A' anyway, even if properly cast back as (int)ptr.

  • calling printf without a proper declaration in scope invokes undefined behavior. The implicit prototype inferred by the compiler from the arguments passed might be incompatible with the varargs calling convention used by printf. You must include <stdio.h> or at least provide a valid prototype before calling printf.
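As a minimal sketch of the last two points, the following shows each argument matched to its conversion specifier with <stdio.h> in scope. The helper name describe_pointer is hypothetical and only there for illustration; snprintf is used instead of printf so the result can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: formats the first character of s and the address
   held in s into buf. Returns whatever snprintf returns. */
int describe_pointer(char *buf, size_t size, const char *s)
{
    /* %c expects an int: *s is a char, which is promoted to int.
       %p expects a void *: the cast makes the argument type exact. */
    return snprintf(buf, size, "%c %p", *s, (void *)s);
}
```

Called with a 64-byte buffer and "ABCD", this stores 'A', a space, and an implementation-defined representation of the address. Passing s itself for %c, as in the question, is the mismatch the compiler warns about.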

On a lesser note, there are some more remarks:

  • "ABCD" is a string literal. It must not be written to. For compatibility with tons of legacy code, the C Standard (reluctantly) gives it a type of char[5] where it should really be const char[5]. This explains why you do not get a warning by default on char *ptr = "ABCD";, but it is wise to allow the compiler to be stricter than the Standard and warn the programmer about this. const correctness may require extensive changes in large projects, but will prevent potential undefined behavior and improve the compiler's ability to optimise code.

  • returning 0 from main() is implicit since C99, but it is considered good style to have an explicit return 0; to indicate success.
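To illustrate the remark about string literals, here is a small sketch, assuming you want both a read-only view of the literal and a copy you can modify. The function name modify_copy is made up for this example.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the writable copy was modified as
   expected and the literal itself was left untouched. */
int modify_copy(void)
{
    const char *lit = "ABCD";  /* const: a write through lit is rejected at compile time */
    char copy[] = "ABCD";      /* array initialized from the literal: 5 writable bytes */

    copy[0] = 'a';             /* fine, copy is an ordinary array */
    return strcmp(copy, "aBCD") == 0 && lit[0] == 'A';
}
```

Declaring the pointer as const char * costs nothing here and turns an accidental lit[0] = 'a'; into a compile-time diagnostic instead of undefined behavior at run time.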

chqrlie
  • I love when compiler says "What is printf? you don't include prototype but be aware that you use it bad because in fact I know what is printf" – Stargateur Dec 10 '16 at 12:10
1

A C string is a sequence of char terminated by the char '\0'.

char *ptr = "ABCD";

This creates an array of 5 chars, {'A', 'B', 'C', 'D', '\0'}, somewhere in static memory and assigns the address of the first char (i.e. 'A') to ptr.
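A quick way to check this layout, as a sketch (the function names literal_size and is_terminated are only illustrative):

```c
#include <stddef.h>

/* sizeof applied to the literal itself yields the array size,
   which includes the terminating '\0'. */
size_t literal_size(void)
{
    return sizeof "ABCD";
}

/* ptr holds the address of the first element; index 4 is the terminator. */
int is_terminated(void)
{
    const char *ptr = "ABCD";
    return ptr[0] == 'A' && ptr[4] == '\0';
}
```

literal_size() gives 5, not 4, because of the terminator, and is_terminated() returns 1.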

When you do

ptr = 'A';

You assign the value of 'A' to ptr. It's not a valid pointer or C string, so dereferencing the pointer causes undefined behavior.

Also, C is a typed language: you cannot store just any value in any object. There is no guarantee that ptr will compare equal to 'A' after the assignment. ptr has type char *, whereas 'A' has type int (in C, character constants are int). These types cannot be mixed without an explicit conversion.

char *ptr = 'A';
if (ptr == 'A') // this is undefined behavior
  *ptr; // same here

Here is what you should write to do what you are trying to do.

char c = 'A';
char *ptr = &c; // here ptr is not a valid c-string, it's just a pointer for one char

printf("%c", *ptr);
Stargateur
  • Even if @WeatherVane is correct, on some implementations just setting a pointer to an invalid address may trigger UB. – Déjà vu Dec 10 '16 at 11:37
  • @WeatherVane Is it better now? I think it was the most important thing to explain, because keeping an invalid pointer is really, really bad. – Stargateur Dec 10 '16 at 11:44
-1

Here ptr initially points to a location in the read-only data section of the program. The assignment ptr = 'A'; then stores the ASCII value of 'A', i.e. 65, in ptr, so ptr points to address 65, hence the undefined behavior.

Anjaneyulu