2

I have picked up this program from somewhere and am trying to understand it.

This line: s[j++] = s[i]; is the cause of crash. My understanding is that for the first time at least the program should not crash because j will be incremented later on. First time j and i's value will be 0.

So, this will be like s[0] = s[0];

Why should this crash?

#include <iostream>

using namespace std;


void squeeze(char a[], char c);

int main()
{
    squeeze("qwiert", 'i');

    return 0;
}

void squeeze(char s[], char c)
{
    int i, j;

    for (i = j = 0; s[i] != '\0'; i++)
    {
        if (s[i] != c)
        {
            std::cout << "\ni: " << s[i];
            s[j++] = s[i];
            std::cout << "\nj: " << s[j];

            std::cout << "\nj : " << j;
            exit(0);
        }
    }

    s[j] = '\0';
}

Output:

i: q

After this the program crashes.

I have put the exit statement to stop the program that point to find the segfault.

Aquarius_Girl
  • 21,790
  • 65
  • 230
  • 411
  • 1
    Could the address space reserved for s, be write protected? Because in main() it's defined like a constant and the compiler could put it into special memory regions. But it's just a guess. – KillPinguin Nov 26 '17 at 03:18
  • Yeah, that was the reason. @KillPinguin – Aquarius_Girl Nov 26 '17 at 03:20
  • 1
    When I compile I get the following warning: "ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]". Some compilers will reject this outright (technically all should) and give you an error. The program will compile with warnings, but warnings mean there is something not quite kosher and the program most likely not behave the way you expect. – user4581301 Nov 26 '17 at 03:22
  • 1
    @user4581301, there is actually nothing in the standard that mandates an implementation reject UB outright. While that might be a good feature, you probably should distinguish between what's *required* and what's useful. – paxdiablo Nov 26 '17 at 03:24
  • 1
    @paxdiablo string literal to `char*` conversion was formally banned in C++11, but yeah, I don't know what standard Aquarius_Girl is compiling for. – user4581301 Nov 26 '17 at 03:32
  • @user4581301, that same wording exists in C++03, though in `2.13.4`, it's been around for a long time. I'll update my answer to include that. – paxdiablo Nov 26 '17 at 03:59
  • @paxdiablo I can't find it either. I've had compiler errors and [have general commentary calling it fact](https://stackoverflow.com/questions/20944784/why-is-conversion-from-string-constant-to-char-valid-in-c-but-invalid-in-c), but I can't find an out-right ban in recent standards. Doesn't mean that 2003 had an exception that allowed the conversion that 2011 doesn't have, but I don't have access to anything earlier than C++2014. – user4581301 Nov 26 '17 at 04:15
  • I've also been digging through implicit conversion and initialization rules. – user4581301 Nov 26 '17 at 04:26
  • Found it in appendix C (diff.lex): `char* p = "abc"; // valid in C, invalid in C++` but reading the rules it references to make this call, I can't make the same connection. – user4581301 Nov 26 '17 at 04:40

3 Answers3

4

You are not permitted to change string literals such as with:

squeeze("qwiert", 'i');

This is covered in pretty much all iterations of the standard(a):

  • C++03 2.13.4.String literals [lex.string] /2;
  • C++11 2.14.5.String literals [lex.string] /12; and
  • C++14 2.14.5.String literals [lex.string] /13.

The same wording exists in each:

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.

The wording has changed a little in the latest C++17 standard but amounts to much the same, currently C++17 5.13.5.String literals [lex.string] /16:

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified. [Note: The effect of attempting to modify a string literal is undefined. - end note]

I suggest you try something like:

char str[] = "qwiert"; // Make a writable copy.
squeeze(str, 'i');     //   then fiddle with that.

(a) The ISO quotes in this answer actually provides an inkling as to why this is the case.

We haven't always had multi-gigabyte machines and it was often the case that certain steps had to be taken to optimise in early compilers (C mostly but carried forward into C++ due to it's initial implementations as a front end to C).

To that end, two strings with the same characters (or overlapping characters in certain ways such as "successful" and "unsuccessful") could share the same memory to reduce space.

That, of course, meant you couldn't change one without affecting the other, which is why this rule was put in place.

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
4

You're passing the string constant "qwiert" to the squeeze function. This function then attempts the modify that string constant, which is illegal. This causes the core dump.

For this to work, you need to pass in an array:

int main()
{
    char str[] = "qwiert";
    squeeze(str, 'i');

    return 0;
}
dbush
  • 205,898
  • 23
  • 218
  • 273
3

The other answers have pointed out the problem and how to resolve it.

I want to point out that you can detect such errors by increasing the warning level on your compiler.

With g++ -Wall, I get the following warning message:

socc.cc: In function ‘int main()’:
socc.cc:10:26: warning: ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
     squeeze("qwiert", 'i');
R Sahu
  • 204,454
  • 14
  • 159
  • 270