Why should I use the extern keyword to declare variables in a namespace scope?

Question

I am quite new to C++ and am currently taking a short course on it. I have some background in Java.

I wish to have a namespace called "Message" that stores unchanging/constant strings used by a variety of different classes throughout my program (e.g. titles, keywords, names, etc.).

If all of these strings were in a class, they would be const and static, so I feel it is better to put them into a namespace rather than a class. My "Message.h" originally looked a bit like this:

#ifndef MESSAGE
#define MESSAGE

#include <string>

namespace Message {
    const std::string NAME = "Car";
    const std::string SEPARATE = " | ";
    const std::string COMMAND = "Please enter a 1, 2 or a 3: ";
}

#endif // MESSAGE

Until an instructor suggested that I change it to this...

Message.h:

#ifndef MESSAGE
#define MESSAGE

#include <string>

namespace Message {
    extern const std::string NAME;
    extern const std::string SEPARATE;
    extern const std::string COMMAND;
}

#endif // MESSAGE

Message.cpp:

#include "Message.h"

const std::string Message::NAME = "Car";
const std::string Message::SEPARATE = " | ";
const std::string Message::COMMAND = "Please enter a 1, 2 or a 3: ";

I had little time to ask the instructor for clarification before the end of the session, and it will be quite some time before I get another opportunity. From what I've researched, it has to do with translation units, and more specifically with using a variable in a different translation unit.

I understand the general concept, but I can't quite see the benefit of using extern in this context.

Won't the include guards be enough to ensure that the Message:: namespace variables aren't declared/defined more than once? Why is the extern keyword recommended in this context, and is it purely for the benefit of compile speed?

The "benefits" is that the end result is valid C++, when multiple translation units are used, instead of a link failure, undefined behavior, and/or violation of the [One Definition Rule](https://stackoverflow.com/questions/4192170/what-exactly-is-one-definition-rule-in-c). The `extern` keyword is not "recommended". It is required. — Sam Varshavchik, Jan 01 '19 at 15:46
Honestly, this is something you can try rather easily yourself. What happens when you include `Message.h` in multiple files, build and link them all together? — StoryTeller - Unslander Monica, Jan 01 '19 at 15:48
I recommend that you learn more about the difference between *declaration* and *definition*. You can have as many declarations of a symbol as you like (if they are all the same of course) but only one single definition. — Some programmer dude, Jan 01 '19 at 15:51
@SamVarshavchik So... In the first case where a header file is used, it's declaring and defining the variables where every time Message.h is included, thus breaking the One Definition Rule. The only thing allowing it to compile is the include guards? The program successfully compiles and runs as expected in both cases(VS 2017). — Jaymaican, Jan 01 '19 at 15:59
@Jaymaican -- it looks like folks have overlooked the `const` on each of those string objects. Those objects have internal linkage (i.e., they are not visible from outside any translation unit that includes that header), so there is no ODR violation. — Pete Becker, Jan 01 '19 at 16:05

score 3 · Accepted Answer · answered Jan 01 '19 at 16:04

Both will work just fine (despite the knee-jerk comments the question has gotten). The difference is a bit subtle.

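In the first example, every translation unit (think .cpp file) that includes that header gets its own copy of each of those three strings. That's okay because they're marked const: a const object at namespace scope has internal linkage, so it is not exported from the translation unit. If they were not const, you'd have multiple definitions of the same symbol, which typically shows up as a link error. Include guards don't affect any of this; they only stop the header from being processed twice within a single translation unit.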

In the second example, there is exactly one copy of each of the three strings. Those copies live in Message.cpp, and will get linked into your executable. Any translation unit that includes Message.h will know about those names, because they're declared in the header, and can use them.
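As a rough single-file sketch of that split between declaration and definition: in a real project the `extern` line would sit in Message.h and the definition in Message.cpp. The function `describe` is a hypothetical helper, not part of the original code; it stands in for any code that includes the header.

```cpp
#include <string>

// What Message.h would contain: a declaration only. No string object is
// created here, so any number of .cpp files can see this line.
namespace Message {
    extern const std::string NAME;
}

// Hypothetical helper standing in for code that includes Message.h.
// It compiles against the declaration alone.
std::string describe() {
    return "Name: " + Message::NAME;
}

// What Message.cpp would contain: the one and only definition, which the
// linker resolves every use of Message::NAME to.
const std::string Message::NAME = "Car";
```

Calling `describe()` from `main` uses the single `Message::NAME` object, wherever its definition happens to live.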

For small things like constant int values, the first approach is most common. For larger things such as string objects (things that typically require non-trivial initialization), the second approach is most common.
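A small sketch of why the header-only approach suits constant int values, assuming the constant is initialized with a fixed value. The namespace `Limits`, the constant `MAX_CHOICE`, and the function `is_valid_choice` are made-up names for illustration.

```cpp
// What a header might contain: a namespace-scope const int has internal
// linkage, so defining it in a header included by many .cpp files is fine.
namespace Limits {                 // hypothetical namespace
    const int MAX_CHOICE = 3;      // initialized with a fixed value
}

// Because the initializer is a constant expression, the name can be used
// wherever a compile-time constant is required.
static_assert(Limits::MAX_CHOICE == 3, "MAX_CHOICE is known at compile time");
int slots[Limits::MAX_CHOICE] = {};  // array bound taken from the constant

// Ordinary run-time use; the compiler is free to substitute the value 3
// directly instead of reading a stored variable.
bool is_valid_choice(int n) {
    return n >= 1 && n <= Limits::MAX_CHOICE;
}
```

An `extern const int` declared in a header and defined in a .cpp file would not work in the `static_assert` or the array bound from other translation units, since its value is not visible there.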

Pete Becker


Thank you so much! This makes a lot of sense to me. A big emphasis of the course I'm taking is writing C++ code to a professional standard. This might be a bit of a vague question, but would it be considered less professional/standard to declare and define constant int values in the first way? – Jaymaican Jan 01 '19 at 16:10
@Jaymaican -- okay, down the rabbit hole. If you don't take the address of a `const int` value, and it gets initialized with a fixed value, the compiler can just use the value wherever it's needed, without storing that value as a named variable anywhere. That's a very common optimization, and it's the main reason for using the first approach. – Pete Becker Jan 01 '19 at 16:15
@Jaymaican -- exactly. This is an example of the "as if" rule: the compiler can leave out the object if your code can't detect that it did it. – Pete Becker Jan 01 '19 at 16:45
Thanks again! (Apologies, but I'm a little short on the terminology here) What is meant by not taking the address of a const int value? Is this in relation to not creating a pointer/reference to it? (To anyone reading this, I accidentally deleted this comment.) – Jaymaican Jan 01 '19 at 16:46
