Best way to store constant data in C++

Question

I have an array of constant data like following:

enum Language {GERMAN=LANG_DE, ENGLISH=LANG_EN, ...};
struct LanguageName {
    ELanguage language;
    const char *name;
};

const Language[] languages = {
    GERMAN, "German",
    ENGLISH, "English",
    .
    .
    .
};

When I have a function which accesses the array and find the entry based on the Language enum parameter. Should I write a loop to find the specific entry in the array or are there better ways to do this.

I know I could add the LanguageName-objects to an std::map but wouldn't this be overkill for such a simple problem? I do not have an object to store the std::map so the map would be constructed for every call of the function.

What way would you recommend?

Is it better to encapsulate this compile time constant array in a class which handles the lookup?

sbi · Accepted Answer · 2009-09-28T09:53:05.013

3

If the enum values are contiguous starting from 0, use an array with the enum as index.

If not, this is what I usually do:

const char* find_language(Language lang)
{
  typedef std::map<Language,const char*> lang_map_type;
  typedef lang_map_type::value_type lang_map_entry_type;

  static const lang_map_entry_type lang_map_entries[] = { /*...*/  }
  static const lang_map_type lang_map( lang_map_entries
                                     , lang_map_entries + sizeof(lang_map_entries)
                                                        / sizeof(lang_map_entries[0]) );
  lang_map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}

If you consider a map for constants, always also consider using a vector.

Function-local statics are a nice way to get rid of a good part of the dependency problems of globals, but are dangerous in a multi-threaded environment. If you're worried about that, you might rather want to use globals:

typedef std::map<Language,const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;

const lang_map_entry_type lang_map_entries[] = { /*...*/  }
const lang_map_type lang_map( lang_map_entries
                            , lang_map_entries + sizeof(lang_map_entries)
                                               / sizeof(lang_map_entries[0]) );

const char* find_language(Language lang)
{
  lang_map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}

edited Sep 28 '09 at 09:53

answered Sep 28 '09 at 09:31

sbi

219,715
46
258
445

Are you worried about memory consumption or performance loss when you do this? In this simple example it can surely be ignored but in large compile time constant arrays it may be significant? – frast Sep 28 '09 at 09:37
I think in a multi-threaded application it could be dangerous to instantiate static objects in functions. Is this right? – frast Sep 28 '09 at 09:43
I will go with your suggestion because it is not likely that my function will be called in multi-threading and it is a really clean and simple approach. Just what I wanted. Thank you very much. – frast Sep 28 '09 at 09:48
@frast: If you're worried about memory/performance, then measure. Your code sounds like it has only a few dozen entries. I wouldn't worry about that. If you do, look at `std::vector >` (or `boost::array`). I'll edit the answer regarding MT. – sbi Sep 28 '09 at 09:50
Thank you again. std::vector > is an interesting idea. I can not use boost::array because we do not intend to add this as a dependancy to our project. – frast Sep 28 '09 at 10:24
+1 This is more or less what I was going to suggest too. The memory usage is going to be almost the same as in your example, and take about the same time to compile. Or were you thinking of loading a file at runtime? Putting it in its own module may help a little. – gavinb Sep 28 '09 at 10:35
If I would do a std::find_if I would save the memory consumed by copying all to the map. But in this case readability and simplicity is more important than memory consumption or performance. – frast Sep 28 '09 at 10:54

score 2 · Answer 2 · answered Sep 28 '09 at 11:13

There are three basic approaches that I'd choose from. One is the switch statement, and it is a very good option under certain conditions. Remember - the compiler is probably going to compile that into an efficient table-lookup for you, though it will be looking up pointers to the case code blocks rather than data values.

Options two and three involve static arrays of the type you are using. Option two is a simple linear search - which you are (I think) already doing - very appropriate if the number of items is small.

Option three is a binary search. Static arrays can be used with standard library algorithms - just use the first and first+count pointers in the same way that you'd use begin and end iterators. You will need to ensure the data is sorted (using std::sort or std::stable_sort), and use std::lower_bound to do the binary search.

The complication in this case is that you'll need a comparison function object which acts like operator< with a stored or referenced value, but which only looks at the key field of your struct. The following is a rough template...

class cMyComparison
{
  private:
    const fieldtype& m_Value;  //  Note - only storing a reference

  public:
    cMyComparison (const fieldtype& p_Value) : m_Value (p_Value) {}

    bool operator() (const structtype& p_Struct) const
    {
      return (p_Struct.field < m_Value);
        //  Warning : I have a habit of getting this comparison backwards,
        //            and I haven't double-checked this
    }
};

This kind of thing should get simpler in the next C++ standard revision, when IIRC we'll get anonymous functions (lambdas) and closures.

If you can't put the sort in your apps initialisation, you might need an already-sorted boolean static variable to ensure you only sort once.

Note - this is for information only - in your case, I think you should either stick with linear search or use a switch statement. The binary search is probably only a good idea when...

There are a lot of data items to search
Searches are done very frequently (many times per second)
The key enumerate values are sparse (lots of big gaps) - otherwise, switch is better.

If the coding effort were trivial, it wouldn't be a big deal, but C++ currently makes this a bit harder than it should be.

One minor note - it may be a good idea to define an enumerate for the size of your array, and to ensure that your static array declaration uses that enumerate. That way, your compiler should complain if you modify the table (add/remove items) and forget to update the size enum, so your searches should never miss items or go out of bounds.

score 2 · Answer 3 · edited May 23 '17 at 10:30

I think you have two questions here:

What is the best way to store a constant global variable (with possible Multi-Threaded access) ?
How to store your data (which container use) ?

The solution described by sbi is elegant, but you should be aware of 2 potential problems:

In case of Multi-Threaded access, the initialization could be skrewed.
You will potentially attempt to access this variable after its destruction.

Both issues on the lifetime of static objects are being covered in another thread.

Let's begin with the constant global variable storage issue.

The solution proposed by sbi is therefore adequate if you are not concerned by 1. or 2., on any other case I would recommend the use of a Singleton, such as the ones provided by Loki. Read the associated documentation to understand the various policies on lifetime, it is very valuable.

I think that the use of an array + a map seems wasteful and it hurts my eyes to read this. I personally prefer a slightly more elegant (imho) solution.

const char* find_language(Language lang)
{
  typedef std::map<Language, const char*> map_type;
  typedef lang_map_type::value_type value_type;

  // I'll let you work out how 'my_stl_builder' works,
  // it makes for an interesting exercise and it's easy enough
  // Note that even if this is slightly slower (?), it is only executed ONCE!
  static const map_type = my_stl_builder<map_type>()
                          << value_type(GERMAN, "German")
                          << value_type(ENGLISH, "English")
                          << value_type(DUTCH, "Dutch")
                          ....
                          ;

  map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}

And now on to the container type issue.

If you are concerned about performance, then you should be aware that for small data collection, a vector of pairs is normally more efficient in look ups than a map. Once again I would turn toward Loki (and its AssocVector), but really I don't think that you should worry about performance.

I tend to choose my container depending on the interface I am likely to need first and here the map interface is really what you want.

Also: why do you use 'const char*' rather than a 'std::string'?

I have seen too many people using a 'const char*' like a std::string (like in forgetting that you have to use strcmp) to be bothered by the alleged loss of memory / performance...

"why do you use 'const char*' rather than a 'std::string'?" In a static table like this, `std::string` often doesn't gain much except for the overhead of constructing string objects and copying string literals into arrays on the heap. Of course, if the results of lookup are going to be used as `std::string` rather than `const char*`, using `std::string` in the lookup table might be better. However, from the code given I think there's a good chance the results of lookups are just passed to some C API in the OS. — sbi, Sep 30 '09 at 15:49

score 0 · Answer 4 · answered Sep 28 '09 at 09:25

0

It depends on the purpose of the array. If you plan on showing the values in a list (for a user selection, perhaps) the array would be the most efficient way of storing them. If you plan on frequently looking up values by their enum key, you should look into a more efficient data structure like a map.

answered Sep 28 '09 at 09:25

Adam Maras

26,269
6
65
91

It is very inconvienent to write this array lookup. A map would be easier to handle. I would prefer a way which is easy to handle and has low overhead. – frast Sep 28 '09 at 09:28
May be it becomes easier if I use std::find on the array but I would still have to define a custom compare functor. – frast Sep 28 '09 at 09:30

score 0 · Answer 5 · answered Sep 28 '09 at 09:29

0

There is no need to write a loop. You can use the enum value as index for the array.

answered Sep 28 '09 at 09:29

Ralph

5,154
1
21
19

This is not true. The enum is not defined in this way. Take a look at the enum values: GERMAN=LANG_DE – frast Sep 28 '09 at 09:33

score 0 · Answer 6 · answered Sep 28 '09 at 09:32

0

I would make an enum with sequential language codes

enum { GERMAN=0, ENGLISH, SWAHILI, ENOUGH };

The put them all into array

const char *langnames[] = {
    "German", "English", "Swahili"
};

Then I would check if sizeof(langnames)==sizeof(*langnames)*ENOUGH in debug build.

And pray that I have no duplicates or swapped languages ;-)

answered Sep 28 '09 at 09:32

Michael Krelin - hacker

138,757
24
193
173

Can't renumber? Well, then you just need to decide where you want the tradeoff. And I don't see anything severely wrong with your current implementation then. Of course it depends on how often you need to infer language name from enum value. But then, you can have a cache for a few most recently used ones and it may well save you a lot of cpu cycles. – Michael Krelin - hacker Sep 28 '09 at 10:04

score 0 · Answer 7 · answered Sep 28 '09 at 09:53

If you want fast and simple solution , Can try like this

enum ELanguage {GERMAN=0, ENGLISH=1};

    static const string Ger="GERMAN";
    static const string Eng="ENGLISH";

    bool getLanguage(const ELanguage& aIndex,string & arName)
    {
        switch(aIndex)
        {
        case GERMAN:
            {
                arName=Ger;
                return true;
            }
        case ENGLISH:
            {
                arName=Eng;
            }
        default:
            {
                // Log Error
                return false;
            }
        }
    }

i have written for string, You can use char * – Satbir Sep 28 '09 at 09:54 — Satbir, Sep 28 '09 at 09:54

Best way to store constant data in C++

7 Answers7