-1

For the code(Full demo) like:

#include <iostream>

struct A
{
    int a;
    char ch[1];
};

int main() 
{
    volatile A *test = new A;
    test->a = 1;
    test->ch[0] = 'a';
    test->ch[1] = 'b';
    test->ch[2] = 'c';
    test->ch[3] = '\0';
    std::cout << sizeof(*test) << std::endl
              << test->ch[0] << std::endl;
}

I need to ignore the compilation warning like

warning: array subscript 1 is above array bounds of 'volatile char 1' [-Warray-bounds]

which is raised by gcc8.2 compiler:

g++ -O2 -Warray-bounds=2 main.cpp

A method to ignore this warning is to use pointer to operate the four bytes characters like:

#include <iostream>

struct A
{
    int a;
    char ch[1];
};

int main() 
{
    volatile A *test = new A;
    test->a = 1;

    // Use pointer to avoid the warning          
    volatile char *ptr = test->ch;
    *ptr = 'a';
    *(ptr + 1) = 'b';
    *(ptr + 2) = 'c';
    *(ptr + 3) = '\0';
    std::cout << sizeof(*test) << std::endl
              << test->ch[0] << std::endl;
}

But I can not figure out why that works to use pointer instead of subscript array. Is it because pointer do not have boundary checking for which it point to? Can anyone explain that?

Thanks.

Background:

  1. Due to padding and alignment of memory for struct, though ch[1]-ch[3] in struct A is out of declared array boundary, it is still not overflow from memory view
  2. Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
    Answer:
    struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
leiyc
  • 903
  • 11
  • 23
  • 3
    It isn't clear what is your question exactly. Both programs have undefined behaviour. The compiler doesn't have to warn you about every instance of undefined behaviour, tis is clearly impossible. – n. m. could be an AI Aug 05 '19 at 03:56
  • 2
    "Due to padding and alignment of memory for struct, though `ch[1]` - `ch[3]` in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning." **No.** C++ is not like that. You are triggering *undefined behavior*. – L. F. Aug 05 '19 at 04:01
  • @L.F., it works in the [demo](http://coliru.stacked-crooked.com/a/c7225b9f784f7459), the size of `struct A` is 8 bytes with `int` 4 bytes, 1 bytes for `ch[1]`, and 3 bytes for padding, so I think we can use the 3 bytes for padding to store some info. – leiyc Aug 05 '19 at 04:20
  • @leiyc I have explained in my answer what "works" means in this case. – L. F. Aug 05 '19 at 04:22
  • This behavior doesn't surprise me. But why code something that specifically invokes undefined behavior? Are you just experimenting to see what might happen if someone wrote stuff like this by accident? – doug Aug 05 '19 at 04:26
  • @doug, I just want to know gcc compiler behavior for this case. Maybe, it is because compiler does not check the valid memory boundary which pointer points to. so it will not give out that warning. – leiyc Aug 05 '19 at 05:31
  • 2
    I haven't run across a C++ compiler that checks pointers at all. Unlike many other languages, C and C++ are close to the metal and generally apply few constraints. It's up to the programmer to avoid undefined behavior and it can occur in many different areas. For instance signed integer overflow is UB. Even with two's complement ints. This sometimes results in surprises. Especially with optimized code since the compiler may assume overflow doesn't happen. – doug Aug 05 '19 at 06:05
  • @doug I have. All the time, my GCC does. – Antti Haapala -- Слава Україні Aug 05 '19 at 08:15

3 Answers3

3

Due to padding and alignment of memory for struct, though ch[1]ch[3] in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning.

C++ does not work the way you think it does. You are triggering undefined behavior. When your code triggers undefined behavior, the C++ standard places no requirement on its behavior. A version of GCC attempts to start some video games when certain kind of undefined behavior is encountered. Anthony Williams also knows at least one case where a particular instance of undefined behavior caused someone's monitor to catch on fire. (C++ Concurrency in Action, page 106) Your code may appear to be working at this very time and situation, but that is just an instance of undefined behavior and you cannot count on it. See Undefined, unspecified and implementation-defined behavior.

The correct way to suppress this warning is to write correct C++ code with well-defined behavior. In your case, declaring ch as char ch[4]; solves the problem.


The standard specifies this as undefined behavior in [expr.add]/4:

When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.

  • If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.

  • Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]),78 the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 ≤ i + j ≤ n and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.

  • Otherwise, the behavior is undefined.

78) An object that is not an array element is considered to belong to a single-element array for this purpose; see [expr.unary.op]. A pointer past the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n for this purpose; see [basic.compound].

L. F.
  • 19,445
  • 8
  • 48
  • 82
  • 1
    Yes, the correct code should declare `ch` to `ch[4]`, but in our real application, we can not change the `struct A`, the `struct A` is generated by other script that if we do not know how long a array is, we declare it with one member...... – leiyc Aug 05 '19 at 04:27
  • 2
    @leiyc Well, then you can only store one character in `ch`. Don't try to hack. – L. F. Aug 05 '19 at 04:28
  • 2
    @leiyc It is ultimately your choice to rely on UB or not. When you do choose to rely on UB though, you void your warranty, and from that point on you are on your own. – n. m. could be an AI Aug 05 '19 at 05:20
  • @n.m. yes, it is. – leiyc Aug 05 '19 at 05:24
  • @leiyc What you are trying to use here is a variable sized array/struct. A different definition of this would use `char ch[]`. The `char ch[1];` is a workaround for compilers that don't understand `char ch[]` and the cause for the warning. Unfortunately those kinds of structs are not compatible with new. You have to malloc them to the right size, allocating a bigger memory block with enough space for the char array you actually end up storing in the struct. Time to upgrade the code to C++ and use std::string, std::array or std::vector. – Goswin von Brederlow Mar 07 '22 at 15:12
  • @GoswinvonBrederlow, yes, it is a case of variable sized array/struct, and should use C-style malloc for the memory allocation. This structure may be old style of message construction during process communication. In C++ area, it seems we can use protocol buffer instead. – leiyc Mar 08 '22 at 03:23
1

I want to avoid the warning like

warning: array subscript 1 is above array bounds of 'volatile char 1' [-Warray-bounds]

Well, it is probably better to fix the warning, not just avoid it.

The warning is actually telling you something: what you are doing is undefined behavior. Undefined behavior is really bad (it allows your program to literally anything!) and should be fixed.

Let's look at your struct again:

struct A
{
    int a;
    char ch[1];
};

In C++, your array has only one element in it. The standard only guarantees array elements of 0 through N-1, where N is the size of the array:

[dcl.array]

...If the value of the constant expression is N, the array has N elements numbered 0 to N-1...

So ch only has the elements 0 through 1-1, or elements 0 through 0, which is just element 0. That means accessing ch[1], ch[2] overruns the buffer, which is undefined behavior.

Due to padding and alignment of memory for struct, though ch1-ch3 in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning.

Umm, if you say so. The example you gave only allocated 1 A, so as far as we know, there is still only space for the 1 character. If you do allocate more than 1 A at a time in your real program, then I suppose this is possible. But that's still probably not a good thing to do. Especially since you might run into int a of the next A if you're not careful.

A solution to ignore this warning is to use pointer...But I can not figure out why that works. Is it because pointer do not have boundary checking for which it point?

Probably. That would be my guess too. Pointers can point to anything (including destroyed data or even nothing at all!), so the compiler probably won't check it for you. The compiler may not even have a way of knowing whether the memory you point to is valid or not (or may just not care), and, thus, may not even have a way to warn you, much less will warn you. Its only choice is to trust you, so I'm guessing that's why there's no warning.

Why don't we just declare the ch to ch4 in struct A to avoid this warning?

Side issue: actually std::string is probably a better choice here if you don't know how many characters you want to store in here ahead of time--assuming it's different for every instance of A. Anyway, moving on:

Why don't we just declare the ch to ch4 in struct A to avoid this warning?

Answer:

struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.

I'm not sure I understand your design principle completely, but it sounds like std::vector might be a better option. Then, size is kept track of automatically by the std::vector, and you know that everything is stored in ch. To access it, it would be something like:

myVec[i].ch[0]

I don't know all your constraints for your situation, but it sounds like a better solution instead of walking the line around undefined behavior. But that's just me.

Finally, I should mention that if you are still really interested in ignoring our advice, then I should mention that you still have the option to turn off the warning, but again, I'd advise not doing that. It'd be better to fix A if you can, or get a better use strategy if you can't.

  • We need use dynamic arrays in our app, but there is no way to do so.( can not use std::vector). So we use this struct to do so. – leiyc Aug 05 '19 at 08:44
  • @leiyc This isn’t a way to do so either. – L. F. Aug 19 '19 at 15:47
  • so we shall manage the potential issues behind ourselves if we choose pointer to ignore the warning... – leiyc Aug 20 '19 at 02:40
0

There really is no way to work with this cleanly in C++ and iirc the type (a dynamically sized struct) isn't actually properly formed in C++. But you can work with it because compilers still try to preserve compatibility with C. So it works in practice.

You can't have a value of the struct, only references or pointers to it. And they must be allocated by malloc() and released by free(). You can't use new and delete. Below I show you a way that only allows you to allocate pointers to variable sized structs given the desired payload size. This is the tricky bit as sizeof(Buf) will be 16 (and not 8) because Buf::buf must have a unique address. So here we go:

#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#include <new>
#include <iostream>
#include <memory>

struct Buf {
    size_t size {0};
    char buf[];
    [[nodiscard]]
    static Buf * alloc(size_t size) {
        void *mem = malloc(offsetof(Buf, buf) + size);
        if (!mem) throw std::bad_alloc();
        return std::construct_at(reinterpret_cast<Buf*>(mem), AllocGuard{}, size);
    }
    private:
    class AllocGuard {};
    public:
    Buf(AllocGuard, size_t size_) noexcept : size(size_) {}
};

int main() {
    Buf *buf = Buf::alloc(13);
    std::cout << "buffer has size " << buf->size << std::endl;
}

You should delete or implement the assign/copy/move constructors and operators as desired. A another good idea would be to use std::uniq_ptr or std::shared_ptr with a Deleter that calls free() instead of returning a naked pointer. But I leave that as exercise to the reader.

Goswin von Brederlow
  • 11,875
  • 2
  • 24
  • 42