2

I suspect I have found a g++ optimization bug which relates to dereferencing arrays in objects (structs) with negative indexes.

In the below, Node is structure which has an array preceding it (in my real code it is a node for a Skip List where both its number of pointers and the size of its data packet are variable and unknown to the underlying SkipList code, hence a decision to put the pointers before the object reference and the data packet - here a long - after the object).

#include <iostream>
#include <stdlib.h>

class Node {
public:
  unsigned int ptr[1]; // really an array going backwards
  long datum; // This seems to be necessary for the bug to surface                                                                                                                                       

};

class NodeList {
public:
  Node* hdr;
  NodeList() {
    void* p_v = malloc( sizeof(Node) + 32 * sizeof( unsigned int ) );
    hdr = (Node*)((char*)p_v + 32 * sizeof(unsigned int));
    hdr->ptr[-5]=100;
  }
  void setNodes() {
    int nn=0;
    while( rand() > 20 && nn<9 ) {
      nn++;
    }
    if( nn < 9 ) {
      nn = 9;
    }
    // It is a logical truth that nn = 9 here                                                                                                                                                            
    //nn = 9; // IF THIS IS UNCOMMENTED EVERYTHING WORKS!                                                                                                                                                
    std::cout << "nn=" << nn << " (should be 9) " << std::endl;
    int ctr = 0;
    for( int i=0; i<=nn; i++ ) {
       ctr++;
       hdr->ptr[-i]=0;
    }
    std::cout << "ctr was incremented " << ctr << " times (should be 10) and hdr->ptr[-5] = " << hdr->ptr[-5] << " (should be 0)\n";
  }
};

int main( int argc, char** argv ) {
    NodeList list;
    list.setNodes();
 }

Expected output has ctr being incremented 10 times and hdr->ptr[-5] being 0. Optimized code just goes through the loop once (ie does not loop), and leaves ptr->hdr[-5] as 100. This is a bug.

-fno-aggressive-loop-optimizations seems to fix it but clearly would be better if the output code was correct.

I am putting this here to (a) get verification that this is a bug since I am a newbie here and this is my first question, (b) ask anyone knowledgeable in the gcc dev community what is to be done about it (eg. how should I report it, and whether it has been fixed in later releases), and (c) to allow folk who have experienced this most frustrating and time-consuming issue on CentOS 7 (or any other distro with 4.8) to see confirmation that they have hit a bug from a fellow sufferer!

galois99
  • 21
  • 1
  • Isn't using negative index for (C-style) array an undefined behaviour ? – nefas May 30 '17 at 07:47
  • No, `arr[n]` is defined to be `*(arr+n)` for *all* n (positive and negative). It is of course up to the implementation to ensure the reference is to properly allocated memory. See [this question](https://stackoverflow.com/q/3473675/8084766) – galois99 May 30 '17 at 08:04
  • [this question](https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c) might have the answer you seek. – nefas May 30 '17 at 08:04
  • 2
    Just plain indexing out of bound, it's UB. Array is declared as having a size of `1` so any array index other than [0] is undefined behaviour. I suspect you may be trying to mix defined `C` behaviour with undefined `C++` behaviour. See __Access out of bounds__ in http://en.cppreference.com/w/cpp/language/ub – Richard Critten May 30 '17 at 08:05
  • @RichardCritten Please make this comment into an answer. – Walter May 30 '17 at 08:17
  • Right you are Richard! So should I declare the first element of `Node` as a plain `unsigned int ptr_0` and reference members before it using something like `unsigned int* ptr = &hdr->ptr_0` then use `ptr[-5]`, for example? – galois99 May 30 '17 at 08:27

1 Answers1

1

One must be careful here regarding the meaning of 'array'. An array is anything defined as

some_type arr[number];

but not something like

some_type*ptr = some_address;

Negative indices are no different to positive indices and ptr[n] is interpreted as *(ptr+n) (you can even use n[ptr] in C). Thus when indexing, pointer arithmetic is used.

It is undefined behaviour (UB) in both C and C++ to access elements outside the bounds of an array, no matter how this is achieved. For example

some_type arr[10];
some_type*ptr = arr+5;
some_type foo = ptr[-4];    // ok, access to arr[1]
some_type bar = ptr[-6];    // UB, out-of-bound access
some_type val = arr[-1];    // UB, out-of-bound access
Walter
  • 44,150
  • 20
  • 113
  • 196
  • What about something like `some_type var1;` declared somewhere. Then somewhere later `some_type* var_p = &var1;` Now the question I have is whether referencing say `var_p[1]` or `var_p[-1]` is UB? If so, how would you do the trick of referencing an array prior to an object without invoking UB? – galois99 May 30 '17 at 13:08
  • @galois99 That's quite a different question from your original post, since no array is involved. I'm not sure, but would strongly advise against this sort of thing, unless you know very well what you're doing (which you don't, for otherwise you would not have asked). – Walter May 30 '17 at 13:20