2

We are working on a program to solve a Boggle game, and the whole program must be executed in less than 0.1s. We insert a dictionary into our code by standard input (which lasts 0.05s, the half of our maximum time). We use the following functions to add the words into the dictionary, which last 0.1 seconds each:

void Node::addWord(const char* word){
  const char idx = *word - 'a';
  if(!mChildren[idx]) mChildren[idx] = new Node(*word);
  if(strlen(word) > 1) mChildren[idx]->addWord(word + 1);
  else mChildren[idx]->setMarker(true);
}

void Trie::addDictionary(const vector<string>* aux){
  auto length = aux->size();
  Node* current = root;
  for (size_t i = 0; i < length; i++) {
    int len = aux->at(i).length();
    if (len >= 3 && len <= DIM*DIM + 1) {
        current->addWord(aux->at(i).c_str());
    }
  }
}

In this case, DIM = 4 (is the dimension of the NxN board of the Boggle) and aux is a vector where all the data from the standard input is dumped. We impose the condition of len >= 3 because we only want words with 3 letters and more.

Any ideas to improve the speed of these functions?

Guy Coder
  • 24,501
  • 8
  • 71
  • 136
  • 1
    Pass the length of the string into addWord - calling strlen as many times as you do is completely uncalled for – UKMonkey Nov 01 '16 at 21:02
  • Using `operator[]` instead of `at()` could be a bit faster because it does not check for out-of-bounds. – alain Nov 01 '16 at 21:11
  • You are not only calling `at` which is slower and redundant in your code, but do it multiple times. Use const reference and range loop `for( const auto &word : *aux ) { ... }`. You should use const reference instead of pointer to vector as well. – Slava Nov 01 '16 at 21:29
  • You may get rid of `if (len >= 3 && len <= DIM*DIM + 1) {` by filtering your input data file before feed it. – Jarod42 Nov 01 '16 at 21:41
  • You use `at` which does bound checking in the safe places, and use `[]` in the unsafe places (as input data might have invalid char) :/ – Jarod42 Nov 01 '16 at 21:48
  • Change `if(strlen(word) > 1) mChildren[idx]->addWord(word + 1);` to `if(word[1]) mChildren[idx]->addWord(word + 1);` – Christopher Oicles Nov 01 '16 at 22:30

1 Answers1

2

The best way to improve the speed of these function is to run a profiler on them. I would not consider optimizing code like this until you run a profiler against it.

That being said, my prediction will be that, if you run the profiler, you will find that a large portion of your runtime (like 90%) will be in new Node(*word). Heap allocations are slow compared to the other operations you are doing. How slow? It depends on your platform, which is why you have to profile to find hotspots. What consumes large amounts of time on one platform may consume very little time on others.

Run a profiler, determine if my claim is correct. If it is correct, then I recommend pool allocating Nodes to decrease the number of allocations that have to occur.

Cort Ammon
  • 10,221
  • 31
  • 45
  • Thanks! Our problem was that when we generated the Node, instead of having preallocated the different possible children (the maximum number of children was 26), using a null pointer, we used an operation to resize the vector every time we introduced a new element. We changed the resize operation to a static declaration of the vector and improved the time from 120ms to 70ms. – Julen Cestero Nov 07 '16 at 11:30