0

I wrote the following program to search for a particular string in a given array of strings. I made an error in the search function and wrote i-- instead of i++.

#include <iostream>
#include <string>
using namespace std;

int search(string S[], int pos, string s)
{
    for(int i=0; i<pos; i--) {
        cout << i << " : " << S[i] << "\n";
        if (S[i] == s) {
            cout << "Inside Return ->\n";
            cout << i << " / " << S[i] << " / " << s << "\n";
            return i;
        }
    }
    return -1;
}

int main()
{
    string S[] = {"abc", "def", "pqr", "xyz"};
    string s = "def";
    cout << search(S,2,s) << "\n";
    return 0;
}

Logically, the loop is infinite and should never stop. What I observed instead was that the if condition was true for every search, and the function returned -1.

I printed the values and noticed that the value of S[-1] is always the same as the third argument passed to the function (the string to be searched for), which is why the function was returning -1 every time.

Is this something that g++ is doing or is it related to the way memory is allocated for the formal arguments of the function?

Output of the above code:

 0 : abc
-1 : def
Inside Return ->
-1 / def / def

PS - I am using g++ (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

Edit - I understand that g++ doesn't check bounds, but I was intrigued by the fact that the value of S[-1] was always the same as s. I was wondering if there are any possible theories for this.

Saurabh
  • 3
    Possible duplicate of [Accessing an array out of bounds gives no error, why?](https://stackoverflow.com/questions/1239938/accessing-an-array-out-of-bounds-gives-no-error-why) – Algirdas Preidžius Oct 24 '18 at 12:33
  • 1
    Yep, this is a duplicate. But also very interesting, as you probably access the function call stack... Please read the answer of @AlgirdasPreidžius linked question for an explanation of undefined behaviour :) – R. Joiny Oct 24 '18 at 12:41
  • 1
    The usual question: how are your readings *not* garbage? They're not values from the array after all... – Quentin Oct 24 '18 at 12:42
  • 5
    @Quentin One person's garbage is another person's treasure! – rodrigo Oct 24 '18 at 12:43
  • 2
    "_I understand that g++ doesn't check for bounds_" It's not just g++. No compiler is mandated to check for bounds, by C++ standard. "_but I was intrigued by the fact that the values of S[-1] was always the same as s._" Undefined behavior is undefined. It may as well be so in this particular compiler version you are using, but it also can format your hard drive, when you update the compiler. – Algirdas Preidžius Oct 24 '18 at 12:43
  • I get it, thanks for explaining – Saurabh Oct 24 '18 at 12:45
  • 4
    Try printing out the address of each string in addition to its value. I suspect `s` is sitting to the “left” of `S[0]` in memory. – Ben Oct 24 '18 at 12:46
  • 1
    This actually explains the problem, the address of S[0] and s are - 0x7ffdf6346350 0x7ffdf6346330 respectively. – Saurabh Oct 24 '18 at 12:50
  • More details: https://en.cppreference.com/w/cpp/language/ub – balki Oct 24 '18 at 14:37
  • Indexing arrays out of bounds is UB! The UB can change from compiler to operating system to hardware, etc. – Francis Cugler Oct 24 '18 at 21:47

1 Answer

7

Access out of bounds is undefined behaviour.

Undefined behaviour does not mean "garbage" or "segfault"; it means literally anything. The read could time travel and make code earlier in the program behave differently. The behaviour of the program, from start to finish, is completely unspecified by the C++ standard whenever any undefined behaviour happens anywhere.

In this case, a naive translation to assembly, combined with the ABI, tells you that local data on the "stack" at run time is located adjacent to things like the arguments to the function.

So a naive rewriting of your code into assembly results in negative indexes reading from the arguments to the function.

But a whole myriad of completely innocuous, common and safe alternative translations of your program into machine code, starting with inlining and going on from there, make this not happen.

When compiling without LTO, or when calling across dynamic library boundaries, you can have some small amount of confidence that the compiler's published ABI will be used to make the call; assuming it anywhere else is dangerous. And if you compile without LTO and rely on that, you now have to audit every build of your code from now until eternity, or risk a bug showing up long afterwards with no apparent cause.

Yakk - Adam Nevraumont