5

I'm implementing a multi-index map in C++11, which I want to be optimized for specific features. The problem I'm currently trying to solve, is to not store key elements more then once. But let me explain.

The problem arose from sorting histograms to overlay them in different combinations. The histograms had names, which could be split into tokens (properties).

Here are the features I want my property map to have:

  1. Be able to loop over properties in any order;
  2. Be able to return container with unique values for each property;
  3. Accumulate properties' values in the order they arrive, but to be able to sort properties using a custom comparison operator after the map is filled;

I have a working implementation in C++11 using std::unordered_map with std::tuple as key_type. I'm accumulating property values as they arrive into a tuple of forward_lists. The intended use, is to iterate over the lists to compose keys.

The optimization I would like to introduce, is to only store properties' value in the lists, and not store them in tuples used as keys in the map. I'd like to maintain ability to have functions returning const references to lists of property values, instead of lists of some wrappers.

I know that boost::multi_index has similar functionality, but I don't need the overhead of sorting as the keys arrive. I'd like to have new property values stored sequentially, and only be sortable postfactum. I've also looked at boost::flyweight, but in the simplest approach, the lists will then be of flyweight<T> instead of T, and I'd like to not do that. (If that IS the best solution, I could definitely live with it.)

I know that lists are stable, i.e. once an element is created, its pointer and iterator remain valid, even after invoking list::sort(). Knowing that, can something be done to the map, to eliminate redundant copies of tuple elements? Could a custom map allocator help here?

Thanks for suggestions.

SU3
  • 5,064
  • 3
  • 35
  • 66
  • So you want a structure with unqiue keys but without sorting? How should that be possible? You need your container to be sorted to be able to efficiently check whether an element with a specific key already exists in the container. I think an unordered_map implementation is a good idea. – jplatte Mar 04 '15 at 23:09
  • Also, I didn't look into it in detail, but I noticed one thing: You created your own hashing struct that you passed to the unordered_map template instantiation as third parameter. You could also (partially, in your case) specialize std::hash instead. It's just an alternative though, I don't think it will make a difference in behaviour or performance. One thing that you should change though is the name of your namespace; names with two leading underscores are reserved for compiler- and standard-library-interal use: http://stackoverflow.com/q/228783/1592377 – jplatte Mar 04 '15 at 23:15
  • @jPlatte: I'm not asking for no sorting. I'm just asking for no duplicate data stored. For example, the map could store pointers to key parts stored in lists. The map can sort the pointers as much as it likes. PS: I didn't know about the namespace naming convention. Thanks for the tip. – SU3 Mar 05 '15 at 23:08
  • 1
    I know you didn't specifically ask for no sorting, but you mentioned sorting as an overhead, which it isn't because like I said your container has to be sorted to provide key uniqueness efficiently. About the naming: That isn't a namespace naming convention, it applies to all names, variables, classes, functions and macros as well as namespaces. – jplatte Mar 05 '15 at 23:17

1 Answers1

3

Have your map be from tuples of iterators to your prop containers.

Write a hash the dereferences the iterators and combines the result.

Replace the forward list prop containers with sets that first order on hash, then contents.

Do lookup by first finding in set, then doing lookup in hash.

If you need a different order for props, have another container of set iterators.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • I was already considering that, but the drawback is that in this implementation I can't write a get function which I could use like this `for (auto& p : mymap.get<0>() ) { ... }` to iterate over a particular property, without p being of a pointer or iterator type. Or can I add a list to my class storing references? Could I make it `std::forward_list>`? – SU3 Mar 05 '15 at 23:14
  • @ivan or write an iterator dereferencing iterator? – Yakk - Adam Nevraumont Mar 06 '15 at 00:52
  • Do you mean write a class for the get function to return, which would implement begin() and end() returning iterator like objects which derefence to references to T? – SU3 Mar 06 '15 at 02:42
  • @ivanp sure, but the range portion is easy. The hard part is iterators over a range of elements which are also iterators, that when you dereference the "outer" iterator it does a double-dereference. – Yakk - Adam Nevraumont Mar 06 '15 at 03:14
  • I finally [implemented](https://github.com/ivankp/propmap_cpp11/blob/b9a2efc8ac29286fa45ab930fd5027c24a312b8c/propmap.hh) it this way, except I still populate the lists directly with `prop_t`, the sets' elements are `tuple`, and map key type is `tuple< tuple* ...>`. But similarly to in your scenario, I first find if a new property is already in the set, and if not, I push it into the list, and insert the respective a pointer with a hash into the set. The problem is that the new implementation is almost a factor of 2 slower. This perplexes me. – SU3 Mar 07 '15 at 18:04