4

strcmp compares the string content, hence it's preferred over if (str1 == str2), which compares the base addresses of the strings.

If so, why would the if condition get satisfied in the below C code:

    char *p2="sample1";
    char* str[2]={"sample1","sample2"};


    if(p2==str[0])
    {
            printf("if condition satisfied\n");
    }

GDB:

(gdb) p p2
$1 = 0x4005f8 "sample1"
(gdb) p str[0]
$2 = 0x4005f8 "sample1"
(gdb) p &p2
$3 = (char **) 0x7fffffffdb38
(gdb) p &str[0]
$4 = (char **) 0x7fffffffdb20
(gdb) p *p2
$5 = 115 's'

What exactly is 0x4005f8 and how do I print it?

Sourav Ghosh
anurag86
  • Compiler is optimizing and storing only one instance of `sample1` – kiran Biradar Jun 06 '19 at 09:59
  • `"sample1"` and `"sample2"` are string constants which aren't allowed to change. The compiler is free to recognize duplicate string constants and use the same memory for them, thereby reducing total memory use. – Tom Karzes Jun 06 '19 at 10:00
  • Possible duplicate of [Why is the behaviour of this code undefined in C?](https://stackoverflow.com/questions/16162253/why-is-the-behaviour-of-this-code-undefined-in-c) – alinsoar Jun 06 '19 at 10:15

4 Answers

5

Whether identical string literals are given distinct storage, or a single copy is shared by every use of the literal, is unspecified.

In your case, the string literal "sample1" has only one copy, and the same address is assigned to both p2 and str[0]. However, this is not guaranteed by the standard.

Quoting C11, chapter 6.4.5

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. [...]

Sourav Ghosh
2

The C language allows identical static strings to be stored only once per program. This means the compiler is allowed to decide whether to optimize the allocation (once instead of twice) for the static string "sample1".

You initialize p2 to point to the area where the literal is stored, and str[0] is also a pointer to the same static string. Hence, whether they are equal is implementation-dependent, and the result of the equality check is unspecified.

Quote from 6.4.5p6, String literals

It is unspecified whether these arrays are distinct provided their elements have the appropriate values.

alinsoar
2

You have declared three strings:

  • sample1, pointed to by p2
  • sample1, pointed to by str[0]
  • sample2, pointed to by str[1]

As these are all "string literals", they must not be modified (attempting to do so is undefined behavior), and they are typically stored in read-only memory.

The compiler is permitted to recognise that you only actually have two unique strings, and thus only store the two strings (it depends on the implementation).


What exactly is 0x4005f8?

What you'll find in memory is probably something like this:

0x0000004005f8  's'
0x0000004005f9  'a'
0x0000004005fa  'm'
0x0000004005fb  'p'
0x0000004005fc  'l'
0x0000004005fd  'e'
0x0000004005fe  '1'
0x0000004005ff  '\0'
0x000000400600  's'
0x000000400601  'a'
0x000000400602  'm'
0x000000400603  'p'
0x000000400604  'l'
0x000000400605  'e'
0x000000400606  '2'
0x000000400607  '\0'
...
0x7fffffffdb20  0xf8
0x7fffffffdb21  0x05
0x7fffffffdb22  0x40
0x7fffffffdb23  0x00
0x7fffffffdb24  0x00
0x7fffffffdb25  0x00
0x7fffffffdb26  0x00
0x7fffffffdb27  0x00
...
0x7fffffffdb38  0xf8
0x7fffffffdb39  0x05
0x7fffffffdb3a  0x40
0x7fffffffdb3b  0x00
0x7fffffffdb3c  0x00
0x7fffffffdb3d  0x00
0x7fffffffdb3e  0x00
0x7fffffffdb3f  0x00

That is to say that:

  • The p2 variable:
    • Is located at address 0x7fffffffdb38
    • Has a value of 0x4005f8
  • The str[0] variable:
    • Is located at address 0x7fffffffdb20
    • Has a value of 0x4005f8
  • The memory address 0x4005f8 is the beginning of the sample1 string, i.e: the s character
  • The memory address 0x4005f9 is the next character of the sample1 string, i.e: the a character
  • ... 0x4005fa is m
  • ... 0x4005fb is p
  • ... 0x4005fc is l
  • ... 0x4005fd is e
  • ... 0x4005fe is 1
  • ... 0x4005ff is \0 or "nul", which terminates the string

When you test p2 == str[0], you test whether the values stored in the two variables are the same. The values are the base address of the string. Both variables point to the "same" string, and thus hold the same value.

It is entirely feasible to store the "same" string (i.e: the same text) in two different memory locations, and in such a situation this test would fail.

What you're effectively saying here is that the two strings are the "same instance": they reside at the same place in memory, and thus must have the same content.

... and how do I print it?

You can either print it a single character at a time using x/1c, or as a nul-terminated string using x/1s (gdb handles C strings properly).


main.c:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
        char *p2 = "sample1";
        char *str[2] = { "sample1", "sample2" };

        if (p2 == str[0]) {
                printf("true\n");
        }

        return 0;
}

Compile:

gcc main.c -o main -g

Run:

$ gdb ./main
[...]
(gdb) start
Temporary breakpoint 1 at 0x4005a5: file main.c, line 4.
Starting program: /home/attie/stackoverflow/56475101/main

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe418) at main.c:4
4       int main(int argc, char *argv[]) {
(gdb) list
1       #include <stdio.h>
2       #include <stdlib.h>
3
4       int main(int argc, char *argv[]) {
5               char *p2 = "sample1";
6               char *str[2] = { "sample1", "sample2" };
7
8               if (p2 == str[0]) {
9                       printf("true\n");
10              }
(gdb) b 8
Breakpoint 2 at 0x4005cc: file main.c, line 8.
(gdb) c
Continuing.

Breakpoint 2, main (argc=1, argv=0x7fffffffe418) at main.c:8
8               if (p2 == str[0]) {
(gdb) print p2
$1 = 0x400684 "sample1"
(gdb) print str[0]
$2 = 0x400684 "sample1"
(gdb) print str[1]
$3 = 0x40068c "sample2"

Print three "strings" from address 0x400684:

(gdb) x/3s 0x400684
0x400684:       "sample1"
0x40068c:       "sample2"
0x400694:       "true"

Print 16 characters from address 0x400684:

(gdb) x/16c 0x400684
0x400684:       115 's' 97 'a'  109 'm' 112 'p' 108 'l' 101 'e' 49 '1'  0 '\000'
0x40068c:       115 's' 97 'a'  109 'm' 112 'p' 108 'l' 101 'e' 50 '2'  0 '\000'

Print the addresses stored at p2, str[0] and str[1]:

(gdb) x/1a &p2
0x7fffffffe308: 0x400684
(gdb) x/1a &str[0]
0x7fffffffe310: 0x400684
(gdb) x/1a &str[1]
0x7fffffffe318: 0x40068c
Attie
1

The other answers have covered why your strings are equal (in the == operator sense); here I want to address your question directly.

0x4005f8 is the address at which the string constant is stored. You can print it with the printf type conversion "%p", which expects a void * argument. Your full statements would be:

printf("p2 = %p\n", (void*)p2);
printf("str[0] = %p\n", (void*)str[0]);

The casts to void * are necessary because, when you pass a pointer as part of a variable argument list, the compiler does not implicitly convert it to void * as it would for a function that took a prototyped void * argument. In practice the code also works without them under GCC.

zwol
joH1
  • Your last sentence ruins this answer. – Bathsheba Jun 06 '19 at 10:34
  • @Bathsheba Not so much sir, given that `char *` and `void *` has same alignment requirement, in this case it's OK. – Sourav Ghosh Jun 06 '19 at 10:51
  • @SouravGhosh When passing pointers through a variable argument list, it is not enough for the actual argument type to have the same alignment and size as the callee's expected type; they must be "compatible types", which means "the same type after stripping top-level cv-qualifiers" in this case. See [N1570 7.16.1.1p2](http://port70.net/~nsz/c/c11/n1570.html#7.16.1.1p2). There is a special case at the end of that paragraph that allows `char *` to be passed to a user of `va_arg` expecting `void *`, and that was probably meant to apply to `printf`, but nothing says `printf` uses `va_arg`. – zwol Jun 06 '19 at 12:00