14

I have a quad-/octree data structure. Im storing the children indexes/ptrs of a cell in an array. Each position in the array represents the location of a child with respect to its parent, e.g. in 2D:

// _____________
// |     |     |
// |  2  |  3  |
// |_____|_____|
// |     |     |
// |  0  |  1  |
// |_____|_____|
// for each cell, 4 children are always stored in row-major order
std::vector<std::array<Integer,4>> children;

I know that the max number of children is a subset of the values that an Integer type can represent. Thus I can identify if a cell is missing a child by using a ''magic'' value like -1 for Integer = int, or std::numeric_limits<unsigned>::max() for Integer = unsigned. This is something that std::optional<Integer> cannot assume.

As far as I understood, this usage of magic values is one of the raison d'être of std::optional. Still, I'm worried about the performance of std::vector<std::optional<int>> in inner loops.

So,

  • Will the performance of std::vector<std::optional<int>> be worse than that of std::vector<int>? (I'm already doing the comparison for "non-existent" value).

  • Or, can the implementation of std::optional be optimized to offer the same performance as a raw int? And how?

Mixing std::optional in the return type of my functions and magic values in my data structure sounds like a very bad idea. I prefer to be consistent and either use one or the other (at least within the same context). Although I could overload the function that performs the comparison with the magic number:

template<T> bool is_valid(const T& t) { 
  return /* comparison with magic value for t */; 
}

for optional types.

gnzlbg
  • 7,135
  • 5
  • 53
  • 106
  • 1
    Not sure about the performace between the two, that's something you'd have to see for you self. Question though, is using bit-information viable solution for you? Since you are storing indexes, can you encode information about the 4 child indeces in a single int? Since you have 4 children, you can use 4 bytes(or more) for each children, that makes 16 bytes total for indeces, and the other 16 you can use to encode whether its active or not? Make a class to handle this so user wouldn't have to worry about internals – dchhetri Jun 26 '13 at 15:38
  • The raison d'être of `std::optional`, is precisely cases where you *don't* have a sentinel value. If you have an "obvious" sentinel, why wouldn't you use it? – André Caron Jun 26 '13 at 15:39
  • @user814628 Each index is an unsigned long long... I have a lot of cells and cannot do what you propose. – gnzlbg Jun 26 '13 at 15:49
  • 1
    @AndréCaron: It would make for clearer code (more cleary expressing the intent). Indeed, it would be nice if you could declare a sentinel value for user-defined types and tell `optional` to use that. You could even make construction of that sentinel value private with `std::optional` as friend. – celtschk Jun 26 '13 at 18:36
  • 1
    If you use `vector>` you get both type-safety and zero-performance overhead. Try: https://github.com/akrzemi1/compact_optional – Andrzej Dec 03 '15 at 09:00
  • @Andrzej github.com/gnzlbg/ndtree uses a modified version of your `compact_optional` library extensively for this ;) Thanks :D – gnzlbg Dec 08 '15 at 10:27

2 Answers2

15

std::optional is going to require additional storage and fit fewer values into cache (it appears you already know the reason for this).

I don't think it's wrong to have a different value stored internally in your data structure from the one exposed by the public API, as long as the internal representation is completely hidden from users.

Furthermore, I suggest you isolate the magic number into a single pair of inline conversion functions.

The compiler should help you remember to use the conversion functions consistently, by generating type errors if you forget. You might even use a thin struct wrapper for an int in your internal data structure, to ensure that no implicit conversion exists (or define a user-defined conversion).

class CompressedOptionalUInt
{
    static const unsigned SENTINEL_MISSING = std::numeric_limits<unsigned>::max();
    unsigned value;

public:
    CompressedOptionalUInt(std::optional<unsigned> val) : value(!val? SENTINEL_MISSING: *val) {}
    operator std::optional<unsigned>() const { ... }
};

and then use std::array<CompressedOptionalUInt>.

Making that into a template, with just the sentinel needing to be defined for each type, should be pretty straightforward.

Sebastian Redl
  • 69,373
  • 8
  • 123
  • 157
Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
3

No, it's not as efficient. As you can see from the reference implementation it has to store, update and check an extra value.

Jack Aidley
  • 19,439
  • 7
  • 43
  • 70