Let's say I have a struct that keeps track of a type using a const char*:

struct Foo {
  const char* type;
};

Suppose I only ever assign this value using a string literal throughout my program:

Foo bar;
bar.type = "TypeA";

Foo baz;
baz.type = "TypeB";

Is it safe to compare this value using a regular == as opposed to a strcmp?

if (bar.type == baz.type) {
  printf("Same\n");
} else {
  printf("Different\n");
}

I would like to do this for performance reasons.

David Callanan
  • that will compare addresses, and not strings – BЈовић Jun 10 '22 at 11:41
  • if you worry about performance for comparing character by character consider to use enums rather than strings – 463035818_is_not_an_ai Jun 10 '22 at 11:50
  • Define "safe". The code won't do nasty things, it just won't give the answer you might expect. – Pete Becker Jun 10 '22 at 11:51
  • @463035818_is_not_a_number Yes I could do that, but I'll need the string anyway, so I figured it might not be worth the hassle to add code for converting between enum and char* all the time. – David Callanan Jun 10 '22 at 12:34
  • If you need the string and also can limit your strings to a range of predefined strings, you might introduce global const arrays holding your strings: `const char TypeA_String[]="TypeA";` and use that whenever you need `"TypeA"` instead of using a string literal. That would force the compiler to use same address. That won't work if you also need to handle strings that are not string literals. – Gerhardh Jun 10 '22 at 12:58
  • On Clang and GCC, there is an option `-fmerge-constants` that normally merge identical string literals to the same address. I used it as an optimization for string internalization. I noticed that sometimes it doesn't work: when *AddressSanitizer* is enabled, and also on Android NDK builds. – prapin Jun 10 '22 at 13:50
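
The named-array idea from Gerhardh's comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `TypeB_String` and the helper `foo_same_type` are hypothetical names that do not appear in the question, and the sketch only holds if every `type` field is assigned from one of these named arrays, never from a bare string literal.

```c
#include <stdbool.h>

/* One named array per type string. Each array is a distinct object,
 * so its address is unique and stable for the whole program. */
static const char TypeA_String[] = "TypeA";
static const char TypeB_String[] = "TypeB";   /* hypothetical second type */

struct Foo {
    const char *type;   /* assumed to always point at one of the arrays above */
};

/* Hypothetical helper: pointer comparison is enough here because both
 * pointers come from the same small set of named arrays. */
static bool foo_same_type(const struct Foo *x, const struct Foo *y)
{
    return x->type == y->type;
}
```

The string is still available for printing through `type`, so no enum-to-string conversion is needed, but the guarantee is lost as soon as one assignment uses a literal such as `"TypeA"` directly.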

2 Answers

There are 3 situations that can happen:

The pointers point to the same space in memory

bar.type = "foobar"; // `bar.type` holds `0xdeadbeef` which holds `"foobar"`
baz.type = "foobar"; // `baz.type` holds `0xdeadbeef` which holds `"foobar"`
if (bar.type == baz.type) { /* true positive */ }

The pointers point to different places in memory, but the memory contents there are the same

bar.type = "foobar"; // `bar.type` holds `0xdeadbeef` which holds `"foobar"`
baz.type = "foobar"; // `baz.type` holds `0xdeadc0ff` which holds `"foobar"`
if (bar.type == baz.type) { /* false negative */ }

The pointers point to different memory areas and those areas have different content

bar.type = "foobar"; // `bar.type` holds `0xdeadbeef` which holds `"foobar"`
baz.type = "bar"; // `baz.type` holds `0xdeadbef2` which holds `"bar"`
if (bar.type == baz.type) { /* true negative */ }

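You cannot have a false positive in this situation: if the two pointers compare equal, they point at the same bytes, so the contents are necessarily the same.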
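
The three situations can be reproduced deterministically with a small sketch. It makes one deliberate substitution: named arrays (`copy_one` and `copy_two`, hypothetical names) stand in for string literals, because whether two identical literals share an address is left to the compiler, while two distinct named arrays always have different addresses.

```c
#include <stdbool.h>
#include <string.h>

/* Two separate objects with identical contents. Arrays are used instead
 * of literals so the addresses are guaranteed to differ. */
static const char copy_one[] = "foobar";
static const char copy_two[] = "foobar";

/* What `==` on two `const char *` values actually tests. */
static bool pointer_equal(const char *a, const char *b)
{
    return a == b;
}

/* What strcmp tests: the characters, not the addresses. */
static bool content_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

With these helpers, `pointer_equal(copy_one, copy_two)` is false even though `content_equal(copy_one, copy_two)` is true, which is the false negative described above. The reverse combination, equal pointers with unequal contents, cannot occur.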

pmg
  • For the "false negative" case might want to add a reference to [Two string literals have the same pointer value?](https://stackoverflow.com/questions/13515561/two-string-literals-have-the-same-pointer-value) – rustyx Jun 10 '22 at 11:55

Is it safe to compare this value using a regular == as opposed to a strcmp?

No. It isn't safe in the sense that two string literals, even with the same content, are not guaranteed to have the same storage address, and thus may compare unequal.

As an optimization, you can compare the addresses first and return early if they match, then fall back to comparing the contents only if the addresses differ.
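
A minimal sketch of that address-first comparison might look like the following. The function name `same_type` is hypothetical, and the `NULL` handling is an added assumption rather than something the question requires.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when a and b refer to the same text. */
static bool same_type(const char *a, const char *b)
{
    if (a == b) {
        return true;    /* same address (or both NULL): contents must match */
    }
    if (a == NULL || b == NULL) {
        return false;   /* exactly one side is NULL (assumed to mean "no type") */
    }
    return strcmp(a, b) == 0;   /* different addresses: compare the characters */
}
```

When the compiler does merge identical literals, the first branch returns without reading any characters. When it does not, the result is still correct, at the cost of one `strcmp` call.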

eerorika
  • Interesting, thank you. Although it is not "guaranteed", do the addresses generally match up in the real-world? Would this be compiler-specific perhaps? Or is it common for the addresses not to match up on all compilers? – David Callanan Jun 10 '22 at 12:37
  • @DavidCallanan within a single translation unit, it's very likely to be the same address. Between shared libraries, it may be unlikely to be the same address. – eerorika Jun 10 '22 at 12:39