
There are several questions on Stack Overflow along the lines of "why can't I initialise static data members in-class in C++". Most answers quote from the standard telling you what you can do; those that attempt to answer why usually point to a link (now seemingly unavailable) [EDIT: actually it is available, see below] on Stroustrup's site where he states that allowing in-class initialisation of static members would violate the One Definition Rule (ODR).

However, these answers seem overly simplistic. The compiler is perfectly able to sort out ODR problems when it wants to. For example, consider the following in a C++ header:

struct SimpleExample
{
    static const std::string str;
};

// This must appear in exactly one TU, not a header, or else violate the ODR
// const std::string SimpleExample::str = "String 1";

template <int I>
struct TemplateExample
{
    static const std::string str;
};

// But this is fine in a header
template <int I>
const std::string TemplateExample<I>::str = "String 2";

If I instantiate TemplateExample<0> in multiple translation units, compiler/linker magic kicks in and I get exactly one copy of TemplateExample<0>::str in the final executable.

So my question is, given that it's obviously possible for the compiler to solve the ODR problem for static members of template classes, why can it not do this for non-template classes too?

EDIT: The Stroustrup FAQ response is available here. The relevant sentence is:

However, to avoid complicated linker rules, C++ requires that every object has a unique definition. That rule would be broken if C++ allowed in-class definition of entities that needed to be stored in memory as objects

It seems however that those "complicated linker rules" do exist and are used in the template case, so why not in the simple case too?

iammilind
Tristan Brindle
  • C++11 relaxes this restriction. You can do in-class initialization with constant expressions. – n. m. could be an AI Sep 20 '13 at 06:27
  • Yes, but my understanding is that even then it requires a definition (with no initialiser) to be present at namespace scope in exactly one translation unit; however in the template case it may appear in a header and thus in multiple TUs. My question is why the symbol coalescing magic used in the template case can't also be used for ordinary, non-template classes. – Tristan Brindle Sep 20 '13 at 06:44
  • Templates were not part of the original C++ language. – Raymond Chen Sep 20 '13 at 07:05
  • [Almost Duplicate](http://programmers.stackexchange.com/questions/145299/why-the-static-data-members-have-to-be-defined-outside-the-class-separately-in-c). – iammilind Sep 20 '13 at 08:04
  • C++17 allows inline initialization of static data members (even for non-integer types): `inline static int x[] = {1, 2, 3};`. See en.cppreference.com/w/cpp/language/static#Static_data_members – Vladimir Reshetnikov Feb 14 '18 at 23:24
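
To illustrate the C++17 facility mentioned in the comment above, here is a minimal sketch that puts the question's two examples side by side. It assumes a C++17 compiler; `limit` and `check_members` are hypothetical names added for illustration and are not part of the original question.

```cpp
#include <cassert>
#include <string>

// Non-template class: since C++17, 'inline' lets the definition (with its
// initialiser) live in the class, so the header may be included in many TUs.
struct SimpleExample
{
    inline static const std::string str = "String 1";

    // Hypothetical extra member: constexpr static data members are implicitly
    // inline since C++17, so no namespace-scope definition is needed.
    static constexpr int limit = 42;
};

// Template class: the out-of-class definition was always allowed in a header.
template <int I>
struct TemplateExample
{
    static const std::string str;
};

template <int I>
const std::string TemplateExample<I>::str = "String 2";

// Hypothetical helper that odr-uses all three members.
inline void check_members()
{
    const int& limit_ref = SimpleExample::limit; // binding a reference odr-uses it
    assert(limit_ref == 42);
    assert(SimpleExample::str == "String 1");
    assert(TemplateExample<0>::str == "String 2");
}
```

If this code sat in a header included from several translation units, both `SimpleExample::str` and `TemplateExample<0>::str` would be expected to end up as a single object in the final program, which is the behaviour the question asks for.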

2 Answers

1

The C++ build structure used to be quite simple.

The compiler built object files which normally contained one class implementation. The linker then joined all of the object files together into the executable file.

The One Definition Rule refers to the requirement that each variable (and function) used in the executable is defined in only one object file created by the compiler. All other object files simply hold external references to the variable or function.

Templates were a very late addition to C++, and they require that all the template implementation details are available during the compilation of every object file that uses them, so that the compiler can do all of its optimizations. This involves lots of inlining and even more name mangling.

I hope this answers your question, because it is the reason for the ODR, and why it doesn't affect templates: the linker has almost nothing to do with templates, which are all managed by the compiler. The exception is the case where you use template specialization to push an entire template expansion into one object file, so that it can be used from other object files that only see the prototypes for the template.
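
A minimal sketch of that last case follows. The answer says "template specialization"; the standard mechanism that matches the description, as far as I can tell, is explicit instantiation (`extern template` plus one explicit instantiation definition), so that is what is shown here, as an assumption about what was meant. `Box`, `tag` and `box_roundtrip` are hypothetical names, and the two halves would normally live in a header and in a single .cpp file respectively.

```cpp
#include <cassert>

// --- would live in a header (hypothetical "box.h") ---
template <class T>
struct Box
{
    static const int tag;
    T value;
    T get() const;
};

// Explicit instantiation declaration: tells every TU that includes the
// header not to instantiate Box<int> itself, only to reference it.
extern template struct Box<int>;

// --- would live in exactly one .cpp file ---
template <class T>
const int Box<T>::tag = 7;

template <class T>
T Box<T>::get() const { return value; }

// Explicit instantiation definition: emits Box<int>'s members in this
// object file only, so other object files link against them.
template struct Box<int>;

// Hypothetical caller that sees only the declarations above.
inline int box_roundtrip(int v)
{
    Box<int> b{v};
    return b.get();
}
```

With this layout the other translation units contain only undefined references to `Box<int>::get` and `Box<int>::tag`, much as they would for an ordinary non-template class.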

Edit:

Back in the olden days, linkers frequently linked object files created with different languages. It was common to link ASM and C, and even after C++ arrived some of that code was still used, which absolutely necessitates the ODR. Just because your project only links C++ files doesn't mean that's all a linker can do, so it won't be changed merely because most projects are now solely C++. Even now many device drivers use the linker according to its more original intention.

Answer:

It seems however that those "complicated linker rules" do exist and are used in the template case, so why not in the simple case too?

The compiler manages the template cases, and just creates weak linker references.

The linker has nothing to do with templates; they are templates used by the compiler to create code it passes to the linker.

So the linker rules are not affected by templates, but the linker rules are still important because the ODR is a requirement of ASM and C, which the linker still links, and which people other than you do still actually use.

Strings
  • "the linker has almost nothing to do with templates" is not entirely true. The linker merges duplicate functions and data if the compiler marks them as mergeable. Template instantiations (functions and static data) are so marked, so why are non-template static members not? – n. m. could be an AI Sep 20 '13 at 08:54
  • @n.m. Precisely, well put – Tristan Brindle Sep 20 '13 at 09:18
  • What exact kind of old code would be broken and in what way? I don't see any reason for breakage. – n. m. could be an AI Sep 20 '13 at 10:31
  • I'm not sure what *your* point is. *No one is going to change C++ compilation model* — no change of "the model", whatever it is, is needed. Current linkers are quite capable of doing what we want. We only need to let the compiler emit the right kind of symbols, which they already do in the case of templates. *If you don't know how to abuse ODR* — Abuse is my middle name, I'm not asking you how to do that. I'm asking what code would be broken under a proposed rule and how. You seem to be quite confident for some reason that old code would be broken, and I fail to see that reason. – n. m. could be an AI Sep 20 '13 at 11:00
  • @strings: The attitude is unnecessary, I was merely asking why this particular C++ facility has the restrictions it does, given that it doesn't (at least to a non committee member) appear to be necessary. I haven't accepted your answer for the reasons @n.m. gave: namely I was after a specific reason why non-template static data member symbols aren't mandated by the standard to be marked as *weak*, as templated static data members are. Breaking old code seems unlikely, given that there's really no way for me to refer to `SimpleExample::str` from Fortran for example. – Tristan Brindle Sep 20 '13 at 15:59
  • **Changing strong to weak linker references is a breaking change** — maybe we will live to see a sample of valid code that would be broken under this change. – n. m. could be an AI Sep 20 '13 at 17:28
  • "There is even a use case just as I have described" --- there's no C++ code in there. I don't know how non-C++ code is relevant to the discussion. "Where the strong ODR is defined in multiple object files, which are selected at link time." --- Sorry, this sentence doesn't parse. "Do you need me to edit all my comments into my answer" --- I don't see what they add to your answer. – n. m. could be an AI Sep 21 '13 at 12:29
  • You are welcome to play with [this class definition](http://pastebin.com/CZSM8GMC) and come up with an example of a broken program. I have tried but failed. – n. m. could be an AI Sep 21 '13 at 12:46
  • Your assumption is incorrect. You include the same header file in several source files, that's what header files are for. Link resulting objects together and see what happens. If you want to demonstrate a broken valid program you need to define `DO_NOT_EMIT_WEAK_SYMBOLS` for exactly one source file, and provide a standard namespace-scope definition of `Example::a` in that source file. Feel free to change the type of `a` to a non-POD or whatever. – n. m. could be an AI Sep 21 '13 at 13:33
  • I just thought of this as I walked away from my keyboard. Let's try boolean logic... Stroustrup says '[A class is typically declared in a header file] AND [a header file is typically included into many translation units]. HOWEVER...'. In boolean terms, A='class declared in header', B='header included in many TU', HOWEVER=NOT. So !(A && B) == !A || !B. That means that to find the reason it cannot be changed (the originally asked question), we should look at the examples where '[A class is not declared in a header file] OR [a header file is not included into many translation units]'. – Strings Sep 21 '13 at 14:13
  • Sorry you've lost me here. I don't think continuing this discussion will be productive. – n. m. could be an AI Sep 21 '13 at 22:00
1

OK, the following example code demonstrates the difference between a strong and a weak linker reference. Afterwards I will try to explain why changing between the two can alter the resulting executable created by a linker.

template.h

class CLASS
{
public:
    static const int global;
};
template <class T>
class TEMPLATE
{
public:
    static const int global;
};

void part1();
void part2();

file1.cpp

#include <iostream>
#include "template.h"
const int CLASS::global = 11;
template <class T>
const int TEMPLATE<T>::global = 21;
void part1()
{
    std::cout << TEMPLATE<int>::global << std::endl;
    std::cout << CLASS::global << std::endl;
}

file2.cpp

#include <iostream>
#include "template.h"
const int CLASS::global = 21;
template <class T>
const int TEMPLATE<T>::global = 22;
void part2()
{
    std::cout << TEMPLATE<int>::global << std::endl;
    std::cout << CLASS::global << std::endl;
}

main.cpp

#include <stdio.h>
#include "template.h"
int main()
{
    part1();
    part2();
}

I accept this example is totally contrived, but hopefully it demonstrates why 'Changing strong to weak linker references is a breaking change'.

Will this build? No. Each file compiles on its own, but the link fails because there are 2 strong references to CLASS::global.

If you remove one of the strong references to CLASS::global, will it build? Yes

What is the value of TEMPLATE::global?

What is the value of CLASS::global?

The value seen through the weak reference is effectively undefined because it depends on the link order, which makes it obscure at best and, depending on the linker, uncontrollable. This is probably acceptable because it is uncommon not to keep all of a template in a single file, since both the prototype and the implementation are required together for compilation to work.

However, because class static data members were historically strong references and could not be defined within the class declaration, it was the rule, and is now at least common practice, to put the full data definition with the strong reference in the implementation file.

In fact, because the linker produces ODR link errors for violations of strong references, it was common practice to have multiple object files (compilation units to be linked) that were linked conditionally to alter behaviour for different hardware and software combinations, and sometimes for optimization benefits. If you made a mistake in your link parameters, you would get an error saying either that you had forgotten to select a specialization (no strong reference) or that you had selected multiple specializations (multiple strong references).

You need to remember that when C++ was introduced, 8-bit, 16-bit and 32-bit processors were all still valid targets, AMD and Intel had similar but different instruction sets, and hardware vendors preferred closed private interfaces to open standards. The build cycle could take hours, days, even a week.

Strings
  • Thanks for the detailed answer. I guess then it basically boils down to history -- static data members were (and still are) basically just C global variables with some compile-time access restrictions, and it's too late to change that now. – Tristan Brindle Oct 01 '13 at 03:31