
I am writing a bitfield abstraction class, which wraps around a 32-bit piece of memory (u32 = unsigned int) and provides access to individual bits or ranges within that memory.

To implement this, I have used a std::map where the unique key is a pointer (not a std::string) to a C character array holding the mnemonic, and the value is a struct containing the bitfield properties (mnemonic, starting position, length, initial value and field value). All of these properties are constant and defined at startup, except for the field value, which changes only when the underlying u32 value changes. (Note: the mnemonic pointer itself is reused as the unique key.)
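
For reference, the layout described above looks roughly like this (a minimal sketch only; mFieldMap and mFieldValue are the names used in the code below, but the struct name and the other member names are assumptions, not taken from the actual class):

#include <map>

using u32 = unsigned int;

struct BitfieldProperties_t
{
    const char * mMnemonic;     // mnemonic (the same pointer is reused as the map key)
    int          mStartPos;     // starting bit position within the u32
    int          mLength;       // field length in bits
    u32          mInitialValue;
    u32          mFieldValue;   // updated only when the underlying u32 changes
};

// Keyed by the mnemonic pointer itself; std::less<const char*> compares addresses,
// which works here because the same pointer is always passed in.
std::map<const char *, BitfieldProperties_t> mFieldMap;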

This is used in an emulator, where getFieldValue(), which returns the bitfield value (read only), is called many times per second.

Compiling and profiling the code under VS 2015 Update 3 (with -O2 and any other speed optimisations I could find) shows that the getFieldValue() function, and by extension std::map::find(), takes up around 60-70% of total CPU time... much too slow.

I have tried other map implementations, such as boost::flat_map, google::dense_hash_map and std::unordered_map; they help somewhat but still end up too slow (~50-60%).

My guess is that I am using a map for the wrong purpose, but I am not sure, considering there are only 5-20 bitfield mappings (a small lookup size)... It just seems much too slow. Most of the time is spent looking up the same field as well.

The relevant class source code can be found here: BitfieldMap32

An example of how the map is initialised at startup (run only once):

struct Fields
{
    static constexpr const char * ADDR = "ADDR";
    static constexpr const char * SPR = "SPR";
};
ExampleClass() // constructor
{
    // registerField(mnemonic, start position, length, initial value)
    registerField(Fields::ADDR, 0, 31, 0);
    registerField(Fields::SPR, 31, 1, 0);
}

And how the field value is accessed (read only):

// getFieldValue definition.
const u32 & BitfieldMap32_t::getFieldValue(const char* fieldName)
{
    // No error checking: assumes fieldName was registered, so find() is dereferenced directly.
    return mFieldMap.find(fieldName)->second.mFieldValue;
}

// Field access.
const u32 value = ExampleClassPointer->getFieldValue(Fields::ADDR);

Any ideas on how to reduce the lookup time? Or do I need to change implementation all together?

marco9999
  • Why not use std::bitset? – Davidbrcz Sep 29 '16 at 08:47
  • So you measured that your code uses so-and-so much of the CPU. But, is it really *slow*? Percentages of the CPU or of the run-time doesn't matter if it really is "fast enough". What is the run-time of your program? Does it have to wait for user input? If the user inputs something, does the user have to wait more than some milliseconds? How long does the user have to wait? Anything less than half a second will actually seem pretty much instantaneous. Do you really want to complicate things (possibly a *lot*) to run in 0.4 seconds instead of 0.5 seconds? – Some programmer dude Sep 29 '16 at 08:50
  • Perhaps you should consider using `std::unordered_map` which uses hashing and should be faster, whereas `std::map` uses binary search. – rakib_ Sep 29 '16 at 08:57
  • If you really have 5-20 bitfield mappings then you may find that linear iteration over a vector works faster than different kinds of maps. – Alexey Guseynov Sep 29 '16 at 08:58
  • 1. Not sure what you mean by use std::bitset - at the very least I would have to perform an iteration over the range, since it has no "range" operator built in. 2. Yes it is slow as in real world performance. The emulated CPU runs at ~300 MHz and I very much doubt it is anywhere near that at the moment. User input will be applied eventually. 3. Already tried std::unordered_map, does not help significantly. 4. I may try that as suggested also by one of the answers if nothing else helps. – marco9999 Sep 29 '16 at 09:31
  • @rakib `std::map`'s implementation is left up to the library implementer. I don't think the standard guarantees a certain implementation, though it does require ordering. – erip Sep 29 '16 at 11:56
  • @erip I'm stating this as per c++'s reference manual. – rakib_ Sep 29 '16 at 14:14
  • @rakib I'm looking at the standard and it doesn't say anything about binary search trees. – erip Sep 29 '16 at 15:06
  • @erip - OK, I might be wrong. But here is the link: http://www.cplusplus.com/reference/map/map/ . The last line of the map description: "Maps are typically implemented as binary search trees." – rakib_ Sep 29 '16 at 16:37

1 Answer


IIUC, using a dictionary (std::map or std::unordered_map) is huge overkill here. Perhaps you should do the following:

  1. The class should just be a wrapper around an internal storage of an integer (or at most an std::bitset).

  2. The mnemonics should be enums, not std::strings.

  3. Internally, have an std::vector efficiently mapping each enum value to a bitmask. (If you're using C++11 enums, see here for how to convert an enum value into a position within the std::vector.)

  4. Each operation should just take the mnemonic, find the bitmask by index, and apply it to the internal storage (a sketch follows below).
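
A minimal sketch of that approach, using the two fields from the question (the class, enum and member names below are illustrative assumptions, not taken from the original BitfieldMap32 code):

#include <cstddef>
#include <vector>

using u32 = unsigned int; // as in the question

// Each mnemonic becomes an enum value, usable directly as a vector index.
enum class Field : std::size_t { ADDR = 0, SPR = 1 };

struct FieldInfo
{
    u32 mMask;  // bits occupied by the field within the u32
    u32 mShift; // position of the field's lowest bit
};

class Bitfield32
{
public:
    Bitfield32()
        : mValue(0),
          // One entry per enum value, in enum order: precomputed mask and shift.
          mFields{ { 0x7FFFFFFFu, 0u },    // ADDR: start 0, length 31
                   { 0x80000000u, 31u } }  // SPR : start 31, length 1
    {
    }

    void setValue(u32 value) { mValue = value; }

    // Index into the vector with the enum value, then mask and shift.
    u32 getFieldValue(Field field) const
    {
        const FieldInfo & info = mFields[static_cast<std::size_t>(field)];
        return (mValue & info.mMask) >> info.mShift;
    }

private:
    u32 mValue;                     // the wrapped 32-bit value
    std::vector<FieldInfo> mFields; // enum value -> mask/shift
};

// Usage:
// Bitfield32 reg;
// reg.setValue(0x80000001u);
// const u32 spr = reg.getFieldValue(Field::SPR); // == 1

Indexing the vector with the enum value is a single offset calculation, so there is no string comparison, hashing or tree traversal on the hot path.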

Ami Tavory
  • As mentioned in my comment, I'm not sure how std::bitset will help with performance when I would need to iterate over a bitset since it provides no "range" function (it is not always single bits). I will try the combination of enum's & vectors and see how that goes. – marco9999 Sep 29 '16 at 09:35
  • @marco9999 I'll look at your code some more, then. In any case, I suggest you consider replacing the `std::string` mnemonics by `enum`s. – Ami Tavory Sep 29 '16 at 09:38
  • Ended up using a std::vector and ordered index keys starting from 0 (enum/ints). Works much better. Guess map was overkill for it. – marco9999 Sep 29 '16 at 12:57
  • @marco9999 Glad it helped you. Sorry I didn't get a chance to look further into your code (work, you know...). All the best. – Ami Tavory Sep 29 '16 at 12:57