1

I have a list of 20k strings that are known at compile time and will never change: a sort of non-configurable dictionary. I do not want to load it at run time from a file, because that would require a lot of unnecessary architecture: finding the file in a certain path, a configuration file to indicate the path, etc.

I came up with a solution like this in C++:

In a.cpp:

std::vector<std::string> dic;
dic.reserve(20000);
#define VECTOR_DIC_ dic
#include "values.inl"
#undef VECTOR_DIC_

Then, in values.inl, a list of 20k push_back calls, like this:

VECTOR_DIC_.push_back("string1");
VECTOR_DIC_.push_back("string2");
...
VECTOR_DIC_.push_back("string20000");

This code compiles and works properly with gcc-4.8 on Debian, but with gcc-4.4 the compilation of a.cpp never finishes.

Why does gcc-4.4 not support this type of large initialization? Also, is there a design pattern for such large initialization for known values at compile time?

Malice
  • 1
    Just store them in const array? – user7860670 Aug 22 '17 at 11:21
  • 7
    You should store this as a `const char *S[]` and then construct `dic` from `S` - This would allow you to construct `dic` as a `const` vector without having to rely on an initialize function. E.g. https://godbolt.org/g/REZdds – Holt Aug 22 '17 at 11:21
  • 1
    There must be something else fishy in your code, because none of the push_back calls happen at compile time, so I don't see any reason why this should take long to compile. As you construct the vector at runtime anyway, I would rather read the values from a file; maybe less efficient, but cleaner code – 463035818_is_not_an_ai Aug 22 '17 at 11:21
  • @tobi303 I expected that gcc optimisation would prevent these calls from being executed at runtime, because all pushed values are known at compile time rather than being variables. I may be wrong, though. – Jordi Adell Aug 22 '17 at 11:26
  • 4
    @JordiAdell I'll be surprised if that is done at compile time... – Elvis Dukaj Aug 22 '17 at 11:27
  • 2
    @JordiAdell even if there is a compiler that does that (I don't think so), it is not a portable optimization that you can rely on – 463035818_is_not_an_ai Aug 22 '17 at 11:30
  • 2
    A `std::vector` relies on dynamic allocation in its implementation, so it's virtually impossible for the compiler to resolve this at compile time. Using a vector for this is a waste of memory (you copy literals into dynamically allocated memory for no good reason) and time; just use a plain array (or, if you really need the vector, make it a `std::vector<const char *>`, so at least you are not copying the string data around). – Matteo Italia Aug 22 '17 at 11:34
  • 2
    ... or use a `std::array`. – YSC Aug 22 '17 at 11:36
  • You could use `boost::list_of` – RazeLighter777 Aug 22 '17 at 11:39
  • @Holt the problem with your suggestion, as well as with YSC's, is that gcc-4.4 does not support C++11, thus I cannot use std::array nor std::begin or std::end – Jordi Adell Aug 22 '17 at 11:44
  • @JordiAdell My bad. Of course it does not. The `const char* DICT[] = { ... }` is then your salvation. – YSC Aug 22 '17 at 11:45
  • 1
    @JordiAdell My suggestion does not rely on `std::array`, and just replace `std::begin(S)`, `std::end(S)` by `S` and `S + 20000`. – Holt Aug 22 '17 at 11:45

2 Answers

2

Use an array of const char * and then initialize your vector from it:

#include <string>
#include <vector>

char const * const S[] = {
    "string1",
    "string2"
};

const std::size_t N_STRINGS = sizeof(S) / sizeof(*S);

const std::vector<std::string> dic(S, S + N_STRINGS);

This compiles fine (did not test with 20k strings though) with g++ 4.4.7.
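As a usage sketch of the same pattern (not part of the answer's tested code): the table below is a three-entry stand-in for the generated 20k list, and the names `WORDS`, `dictionary` and `contains` are hypothetical. It sticks to C++03 so that it should also build with gcc-4.4, and it wraps the vector in a function-local static, which is one way to avoid depending on the initialization order of globals across translation units.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the generated table; the real file would hold ~20k literals.
static char const * const WORDS[] = {
    "apple",
    "banana",
    "cherry"
};

// Element count derived from the initializer list, so it never goes stale.
static const std::size_t N_WORDS = sizeof(WORDS) / sizeof(*WORDS);

// Builds the vector once, on first call, using the iterator-range constructor.
const std::vector<std::string>& dictionary() {
    static const std::vector<std::string> dic(WORDS, WORDS + N_WORDS);
    return dic;
}

// Linear lookup; fine for a sketch, a sorted table would allow binary search.
bool contains(const std::string& word) {
    const std::vector<std::string>& dic = dictionary();
    return std::find(dic.begin(), dic.end(), word) != dic.end();
}
```

Calling `contains("banana")` walks the vector built from the array; the string data is copied once, when `dictionary()` is first called.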

Holt
1

The compiler probably balks because the initialization is not inside a function.

To make it work, insert the initializers inside a function.

As in:

std::vector<std::string> dic;  // wouldn't an std::set be a better match?

bool InitDictionary() {
  dic.reserve(20000);
  // no trailing semicolon: VECTOR_DIC_.push_back(...) must expand to dic.push_back(...)
  #define VECTOR_DIC_ dic
  #include "values.inl"
  #undef VECTOR_DIC_
  return true;
}

// you can then call InitDictionary at your discretion from within your app
// or the following line will initialize before the call to main()
bool bInit = InitDictionary();

Alternatively, the static const char* approach is also viable. You would have to change your strings file to the format below; I suggest including the entire declaration in it, since the file is probably generated by software. The array should be sorted beforehand, so that you can search it using binary_search, upper_bound, etc.

#include <cstddef>

const char* const dic[] = {  // <-- optionally in the generated file too; the size is deduced from the entries
    "string1",
    "string2",
    "string3",
    "string4",
    // ...
};
const std::size_t DIC_SIZE = sizeof(dic) / sizeof(dic[0]);  // number of entries

You can either give the file a .cpp extension, or include it as:

#include "dictionary.inc"
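
To illustrate the sorted-array idea, here is a minimal sketch of a lookup that searches the `const char*` table directly with `std::binary_search`, without building a vector. The four-entry table and the names `SORTED_DIC`, `cstr_less` and `in_dictionary` are assumptions made for the example; the entries must already be sorted in `strcmp` order for the search to be valid.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Stand-in for the generated table; entries must be sorted in strcmp order.
static const char* const SORTED_DIC[] = {
    "alpha",
    "bravo",
    "charlie",
    "delta"
};
static const std::size_t SORTED_DIC_SIZE =
    sizeof(SORTED_DIC) / sizeof(SORTED_DIC[0]);

// Orders C strings by content rather than by pointer value.
static bool cstr_less(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}

// O(log n) membership test over the read-only table; nothing is copied.
bool in_dictionary(const char* word) {
    return std::binary_search(SORTED_DIC, SORTED_DIC + SORTED_DIC_SIZE,
                              word, cstr_less);
}
```

The custom comparator matters: without it, `std::binary_search` would compare the pointers themselves instead of the characters they point to.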
Michaël Roy