2

In our project we often use a construct like this (simplified for clarity, we actually use a more safe to use version):

struct Info
{
    Info(int x, int y) : m_x(x), m_y(y)
    {}

    int m_x;
    int m_y;
};

struct Data
{
    static const Info M_INFO_COLLECTION[3];
};

const Info Data::M_INFO_COLLECTION[] =  // global-constructors warning
{
    Info(1, 2),
    Info(10, 9),
    Info(0, 1)
};

M_INFO_COLLECTION can contain a large number of data points. The initialization part resides in the cpp file, which is often code-generated.

Now this structure gives us a fair amount of global-constructors-warnings in clang across our code base. I've read in a blog post that using warnings in the -Weverything group is a bad idea for nightly builds and I agree, and even the clang docdoes not recommend to use it.

Since the decision to switch off the warning is out of my hands, I could use a helpful trick to get rid of the warning (and a potential static initialization order fiasco), by converting the static member into a function initializing and returning a local static variable.

However, since our project does not use dynamically allocated memory as a rule, the original idea has to be used without a pointer, which can lead to deinitialization problems when my Data class is used by other objects in a weird way.

So, long story short: The global-constructors warning points to a piece of code that I can review as safe, since I know what the Data class does. I could get rid of it using a workaround that can lead to problems if other classes use Data in a particular way, but this won't generate a warning. My conclusion is that I would be better off just leaving the code as it is and ignore the warning.

So now I am stuck with a bunch of warnings, that may in some cases point to a SIOF and which I would like to address, but which get buried unter a mountain of warnings that I deliberately do not want to fix, because the fix would actually make things worse.

This brings me to my actual question: Is clang too strict in its interpretation of the warning? From my limited compiler understanding, shouldn't it be possible for the compiler to realize that in this particular case, the static member M_INFO_COLLECTION cannot possibly lead to a SIOF, since all its dependencies are non-static?

I played around with this issue a bit and even this piece of code gets the warning:

//at global scope

int get1() 
{
    return 1;
}

int i = get1(); // global-constructors warning

Whereas this works fine, as I would expect:

constexpr int get1() 
{
    return 1;
}

int i = 1;  // no warning
int j = get1(); // no warning

This looks fairly trivial to me. Am I missing something or should clang be able to suppress the warning for this example (and possibly my original example above as well)?

Cerno
  • 751
  • 4
  • 14
  • Possible dupe https://stackoverflow.com/questions/15708411/how-to-deal-with-global-constructor-warning-in-clang? – cigien Oct 25 '20 at 23:12
  • Disable the warning `#pragma clang diagnostic ignored "-Wglobal-constructors"` on a needed basis in the source code itself. For my projects, I disable the "noisy cricket" warnings as part of the project, but if that's not an option you can do it on a case-by-case basis in the code. – Eljay Oct 25 '20 at 23:27
  • @cigien I already found that question and it lead me to the idea of using a function with a local static variable. In this question I wanted to address why clang is not able to deal with the obviously unproblematic cases. I agree that both questions are closely related, but there are slight differences. – Cerno Oct 26 '20 at 00:53
  • @Eljay That's a good idea. I'll check whether I can get that through with our department, although my preference would be to switch if off globally. – Cerno Oct 26 '20 at 00:54
  • Ok, sounds good. I'll leave the comment in, so that plus your response should ensure it doesn't get closed. The titles at least make them seem very similar :) – cigien Oct 26 '20 at 01:02

1 Answers1

3

The problem is that it is not constant initialized. This means that M_INFO_COLLECTION may be zero-initialized and then dynamically initialized at run time.

Your code generates assembley to set M_INFO_COLLECTION dynamically because of the "global constructor" (non-constant initialization): https://godbolt.org/z/45x6q6

An example where this leads to unexpected behaviour:

// data.h
struct Info
{
    Info(int x, int y) : m_x(x), m_y(y)
    {}

    int m_x;
    int m_y;
};

struct Data
{
    static const Info M_INFO_COLLECTION[3];
};


// data.cpp
#include "data.h"

const Info Data::M_INFO_COLLECTION[] =
{
    Info(1, 2),
    Info(10, 9),
    Info(0, 1)
};


// main.cpp
#include "data.h"

const int first = Data::M_INFO_COLLECTION[0].m_x;

int main() {
    return first;
}

Now if you compile main.cpp before data.cpp, first may access Info out of it's lifetime. In practice, this UB just makes first 0.

E.g.,

$ clang++ -I. main.cpp data.cpp -o test
$ ./test ; echo $?
0
$ clang++ -I. data.cpp main.cpp -o test
$ ./test ; echo $?
1

Of course, this is undefined behaviour. At -O1, this issue disappears, and clang behaves as if M_INFO_COLLECTION is constant initialized (as-if it reordered the dynamic initialization to before first's dynamic initialization (and every other dynamic initialization), which it is allowed to do).

The fix to this is to not use global constructors. If your static storage duration variable is able to be constant initialized, make the relevant functions/constructors constexpr.

If you are not able to add constexprs or have to use a non-constant initialized variable, then you can resolve the static initialization order fiasco without dynamic memory using placement-new:

// data.h
struct Info
{
    Info(int x, int y) : m_x(x), m_y(y)
    {}

    int m_x;
    int m_y;
};

struct Data
{
    static auto M_INFO_COLLECTION() -> const Info(&)[3];
    static const Info& M_ZERO();
};

// data.cpp
#include "data.h"

#include <new>

auto Data::M_INFO_COLLECTION() -> const Info(&)[3] {
    // Need proxy type for array reference
    struct holder {
        const Info value[3];
    };
    alignas(holder) static char storage[sizeof(holder)];
    static auto& data = (new (storage) holder{{
        Info(1, 2),
        Info(10, 9),
        Info(0, 1)
    }})->value;
    return data;
}

const Info& Data::M_ZERO() {
    // Much easier for non-array types
    alignas(Info) static char storage[sizeof(Info)];
    static const Info& result = *new (storage) Info(0, 0);
    return result;
}

Though this does have minor runtime overhead for each access, especially the very first, compared to regular static storage duration variables. It should be faster than the new T(...) trick, since it doesn't call a memory allocation operator.


In short, it's probably best to add constexpr to be able to constant initialize your static storage duration variables.

Artyer
  • 31,034
  • 3
  • 47
  • 75
  • You raise a few very good points, thank you. I missed the part where another globally static object may access `Data` because I mistakenly thought that this call would be caught by the compiler (which it doesn't), so I now see the reason for the warning more clearly. Even though we generally don't set access our static objects from other static objects, this could happen and I think it would be a good idea to be safe here. – Cerno Oct 26 '20 at 00:59
  • Your suggestion about `constexpr` sounds very exciting and I learned something about the usage of `constexpr`. I didn't know I could make an object `constexpr` if I made the class's constructor `constexpr` (and possibly fulfull some other requirements). I'll check whether I can make that work for our code. It sounds pretty promising at first glance. Thank you. – Cerno Oct 26 '20 at 01:10
  • The idea of using a placement new had crossed my mind but I thought it would not solve the SIOF since it wasn't dynamic, so I didn't try it out, and I probably wouldn't have got to your solution of storing the data bytes individually. I have to admit, I don't fully understand why your solution does not run into the deinitialization problem I linked to, nor why the isocpp faq does not mention it. But if it does work, it's a cool solution, even if it is a bit unwieldy. – Cerno Oct 26 '20 at 01:18
  • 1
    @Cerno Placement new and regular `new T()` both have dynamic storage duration. We don't call `delete` or `~T()` so there is no issue using after destruction, since it is never destroyed – Artyer Oct 26 '20 at 01:28
  • Thanks for the clarification. That's a pretty cool idea. Do you have any idea why isocpp does not mention it? Did you just make it up or is it a known but more obscure pattern? I haven't encountered it anywhere before. – Cerno Oct 26 '20 at 01:31
  • One final question: I took a closer look at your `placement new` code and realized that since we are already wrapping our arrays in a structure similar to your `holder` class (for boundary checks), everything gets a little easier. My question however is why you need to define `storage` as bytes. I was able to compile a version of your suggestion where storage is simply defined as `static holder storage;`. It compiles and seems to work. What benefit do you see in using a `char` array? – Cerno Oct 26 '20 at 01:57
  • 1
    @Cerno If you have `static holder storage;` you will call the destructor for `holder` including it's member array at the end of the program (possibly before some other destructor uses it), running into the whole unordered statics issue again. Byte arrays are the default thing to placement new on since they doesn't have destructors and `sizeof(char) == 1` – Artyer Oct 26 '20 at 02:09
  • Thanks! I didn't know that but it makes sense I guess. I somehow thought that as soon as we connect the `placement new` to the static `storage`, it would inherit the placement new's dynamic lifetime and remain constructed until the very end of the program (unless we destruct it manually). Isn't it a bit weird that I can use `placement new` to fill data into existing memory and then allow the memory to be deconstructed before the lifetime of the pointer ends? Sorry for my ignorance if I'm getting this wrong, but this is a level of C++ that exceeds my skills. – Cerno Oct 26 '20 at 02:43