5

I have a std::map object map<string , Property*> _propertyMap, where string is the property's name and Property* contains the property values.

I need to process the properties values and convert them to a specific data format- each property has its own format, e.g.. if the map initialization is as following:

_propertyMap["id"]   = new Property(Property::UUID, "12345678");
_propertyMap["name"] = new Property(Property::STRING, "name");
....

then "id" should be processed differently than "name" etc.

This means that I need to look for each property in the map and process its values accordingly.

I thought about two ways to do that.

One, use std::map::find method to get a specific property, like that:

map<string , Property*>::iterator it1 = _propertyMap.find("id");
if(it1 != _propertyMap.end())
{
   //element found - process id values
} 
map<string , Property*>::iterator it2 = _propertyMap.find("name");
if(it2 != _propertyMap.end())
{
   //element found - process name values
} 
....

Two, iterate the map and for each entry check what the property's name is and proceed accordingly:

for (it = _propertyMap.begin(); it != _propertyMap.end(); ++it )
    {
        //if it is events - append the values to the matching nodes
        if (it->first == "id")
        {
           //process id values
        }
        else if (it->first == "name")
                {
                    //process name values
                }
        .....
    }

Given that the Time complexity of std::map::find is O(logN), the complexity of the first solution is O(NlogN). I'm not sure about the complexity of the second solution, because it iterates the map once (O(N)), but performs a lot of if-else each iteration. I tried to google common map::find() questions, but couldn't find any useful information; most of them just need to get one value from the map, and then find() does this with better complexity (O(logN) vs O(N)).

What is a better approach? or perhaps there is another one which I didn't think of?

Also, code styling speaking, which one is more good and clear code?

Community
  • 1
  • 1
user3114639
  • 1,895
  • 16
  • 42
  • Looks some how strange and not clear (for me). E.g in your `_propertyMap["id"]` should only be one element, or should this be a list or vector? Than you are going to write N else if or find statement? If I remember correctly for complexity theory only compares are counted. Which will be N*N for your second "solution" – hr_117 Mar 20 '16 at 16:25
  • The map contains properties, each property has a key (`string`) and value (object of type `Property*`). In the example, `_propertyMap["id"]` is an entry that contains the key "id" and the value `new Property(Property::UUID, "12345678")` and so on for each map entry. Then I want to process the data so I need to find a specific entry, which I don't know how to find; using `find` or a loop. – user3114639 Mar 20 '16 at 16:30
  • @hr_11, This is my question; is the second one indeed O(N*N)? because not all the `if-else` will be reached each iteration, so maybe it's O(NlogN) as well. Also I'm not sure what is a better and clearer code. – user3114639 Mar 20 '16 at 16:37
  • If the complexity of the first approach is `O(NlogN)`, then the complexity of the second is `O(N^2)`. – juanchopanza Mar 20 '16 at 16:39
  • @user3114639 Time both approaches. Which one is better depends on the ratio between number of elements in the map and the number of things you want to find. – LogicStuff Mar 20 '16 at 16:40
  • @user3114639 yes it will be O(N*N*k) where k may be 1/2 but does not netter (form theoretical point of view)..So O(N*N) – hr_117 Mar 20 '16 at 16:40
  • If you put the different "process name/id/... value" into a virtual function (You need than PrpoertyID and so on) of Property, than you don't need any compare and the complexity would be O(N) for the loop solution. – hr_117 Mar 20 '16 at 16:55
  • The first one is cleaner the code I think. In terms of complexity I don't think it matters, the map is probably small enough that the processing code takes way longer anyway. It's worth profiling if you think it may be an issue. – Claudiu Mar 20 '16 at 17:22

3 Answers3

2

I see a few different use-cases here, depending on what you have in mind:

Fixed properties

(Just for completeness, i guess it is not what you want) If both name and type of possible properties should be fixed, the best version is to use a simple class/struct, possibly using boost::optional (std::optional with C++17) for values that might be present or not

struct Data{
    int id = 0;
    std::string name = "";
    boost::optional<int> whatever = boost::none;
}

Pros:

  • All "lookups" are resolved at compile-time

Cons:

  • No flexibility to expand at runtime

Process only specific options depending on their key

If you want to process only a specific subset of options, but keep the option to have (unprocessed) custom keys your approaches seem suitable.

In this case remember that using find like this:

it1 = _propertyMap.find("id");

has complexity O(logN) but is used M times, with M beeing the number of processed options. This is not the size of your map, it is the number of times you use find() to get a specific property. In your (shortened) example this means a complexity of O(2 * logN), since you only look for 2 keys.

So basically using M-times find() scales better than looping when only the size of the map increases, but worse if you increase the number of finds in the same manner. But only profiling can tell you which one is the faster for your size and use case.

Process all options depending on type

Since your map looks a lot like the keys can be custom but the types are from a small subset, consider looping over the map and using the types instead of the names to determine how to process them. Something like this:

for (it = _propertyMap.begin(); it != _propertyMap.end(); ++it )
{
    if (it->first.type() == Property::UUID)
    {
       //process UUID values
    }
    else if (it->first.type() == Property::STRING)
    {
        //process STRING values
    }
    .....
}

This has the advantage, that you do not need any information about what the keys of your map really are, only what types it is able to store.

Community
  • 1
  • 1
Anedar
  • 4,235
  • 1
  • 23
  • 41
1

Suppose we have a map of N properties, and we are looking for a subset of P properties. Here a rough analysis, not knowing the statistical distribution of the keys:

  • In the pure map approach you search P times with a complexity of O(log(n)), that is O(p*log(n))

  • In the chained-if approach you are going to traverse once the map. That's O(N). But you should not forget that an if-then chain is also a (hiden) traversal of list of P elements. So for every of the N elements you are doing a search of potentially up to P elements. So that you have here a complexity of O(p*n).

This means that the map approach will outperform your traversal, and the performance gap will increase significantly with n. Of course this doesn't take into account function call overhead in map that you don't have in the if-chain. So that if P and N are small, your approach could still stand the theoretical comparison.

What you could eventually do to increase peformance further would be to use an unordered_map, which is O(1) in complexity, reducing your problem complexity to O(P).

Christophe
  • 68,716
  • 7
  • 72
  • 138
1

There is another option which combines the best of both. Given a function like this (which is an adaptation of std::set_intersection):

template<class InputIt1, class InputIt2, 
         class Function, class Compare>
void match(InputIt1 first1, InputIt1 last1,
           InputIt2 first2, InputIt2 last2,
           Function f, Compare comp)
{   
    while (first1 != last1 && first2 != last2) {
        if (comp(*first1,*first2)) {
            ++first1;
        } else {
            if (!comp(*first2,*first1)) {
                f(*first1++,*first2);
            }
            ++first2;
        }
    }
}

You can use it to process all your properties in O(N+M) time. Here is an example:

#include <map>
#include <string>
#include <functional>
#include <cassert>

using std::map;
using std::string;
using std::function;

struct Property {
  enum Type { UUID, STRING };
  Type type;
  string value;
};

int main()
{
    map<string,Property> properties;
    map<string,function<void(Property&)>> processors;

    properties["id"] = Property{Property::UUID,"12345678"};
    properties["name"] = Property{Property::STRING,"name"};

    bool id_found = false;
    bool name_found = false;

    processors["id"]   = [&](Property&){ id_found = true; };
    processors["name"] = [&](Property&){ name_found = true; };

    match(
       properties.begin(),properties.end(),
       processors.begin(),processors.end(),
       [](auto &a,auto &b){ b.second(a.second); },
       [](auto &a,auto &b) { return a.first < b.first; }
    );

    assert(id_found && name_found);
}

The processors map can be built separately and reused to reduce the overhead.

Vaughn Cato
  • 63,448
  • 5
  • 82
  • 132