3

I have a class that has 3-4 data members of type std::map<string, vector<string>> which is used for caching the data. Its instance is created once and the data is filled in all the maps from a service call. I have getter functions that are used to access these maps. (this has some thread locking logic in it as well)

These getter functions gets called a lot of times and I am worried about the performance as the map objects are being copied a lot of times.

class A {
  private:
    map<string, vector<string>> m1;
    map<string, vector<string>> m2;
    map<string, vector<string>> m3;

  public:
    map<string, vector<string>> getm1() { return m1; }
    map<string, vector<string>> getm2() { return m2; } 
    map<string, vector<string>> getm3() { return m3; }
}

class B {
   B() { }
   static A a;

   map<string, vector<string>> getm1() { return a.m1; }
   map<string, vector<string>> getm2() { return a.m2; } 
   map<string, vector<string>> getm3() { return a.m3; } 
} 

There are multiple times these getter functions get called from class B. Having intermediate knowledge on cpp, I know that the getter would return an entire map by value. But, would it be better to pass it by reference or using a shared pointer to the map like storing the member variables m1, m2, m3 as shared_ptr<map>

  shared_ptr<map<string, vector<string>>> getm1() { return a.m1; } 

Is this not a performance concern and will be taken care of, by the compiler ?

After reading a little bit about Return value optimization and understanding it a little bit, compilers can handle some optimizations. Is this a part of RVO ?

Thank you in advance.

Walter
  • 44,150
  • 20
  • 113
  • 196
Cool Geek
  • 71
  • 5
  • 4
    Why not just make the maps public? Or, what is the purpose of your getters? – juanchopanza Feb 20 '18 at 22:06
  • 2
    What is the reason for returning a copy of the map? Unless the client wants to play around with their own copy, if the intent is to return the original map, you should be returning a reference to the map anyway, not a copy. – PaulMcKenzie Feb 20 '18 at 22:09
  • Are you intending to return a copy, or do you just want to access the map? – Robinson Feb 20 '18 at 22:09
  • Just updated the description. I have some thread locking logic inside the getting which is why I have the getter function – Cool Geek Feb 20 '18 at 22:10
  • @robinson I only want to access the map. – Cool Geek Feb 20 '18 at 22:10
  • 1
    @CoolGeek *I only want to access the map* -- Then this all becomes a moot point. Return a reference to the map. If you really did return a copy, then prepare for iterator-related bugs once you started to use the copy as if you're accessing the original. – PaulMcKenzie Feb 20 '18 at 22:12
  • Use a const reference (assuming you don't wish to change it), which would be map> const &. – Robinson Feb 20 '18 at 22:12
  • RVO is applicable to the temporary object in your function return which you have none – Killzone Kid Feb 20 '18 at 22:12
  • If you have a multithreading at play here (which you should have mentioned in the very beginning!) you probably would want a shared pointer and locking around it. – SergeyA Feb 20 '18 at 22:16
  • 1
    If you have multithreading and any chance that one thread will be modifying the map while others use it, then you should not give direct access to the map. Not with a reference, not with a shared pointer. – juanchopanza Feb 20 '18 at 22:24
  • Shared pointers are irrelevant here. Only use them if the caller needs to keep using the object after the entire cache has been destroyed (unlikely). – Galik Feb 20 '18 at 22:53
  • Are your maps ever modified after construction of the class? If not, then obviously make them `const` members and return `const map_type&` from the getters. Otherwise, life is more complicated, see [Galik's answer](https://stackoverflow.com/a/48895429/1023390) for an example. – Walter Feb 20 '18 at 22:54
  • Thank you all for your comments! – Cool Geek Feb 21 '18 at 15:39

2 Answers2

9

Return references (&) to const maps.

class A
{    
    using Map_type = std::map<std::string, std::vector<std::string>>;

    Map_type m1;
    Map_type m2;
    Map_type m3;

  public:
    const auto& getm1() const { return m1; }
    const auto& getm2() const { return m2; } 
    const auto& getm3() const { return m3; }
};

That will allow the calling function "read only" access to the maps without paying the price of a copy.


For C++11 and lower, instead of auto as the return type , the return type has to to be declared on the function. Also using is available only for C++11 and above, but for earlier compilers typedef has to be used.

class A
{    
    typedef std::map<std::string, std::vector<std::string>> Map_type;

    Map_type m1;
    Map_type m2;
    Map_type m3;

  public:
    const Map_type& getm1() const { return m1; }
    const Map_type& getm2() const { return m2; } 
    const Map_type& getm3() const { return m3; }
};
Robert Andrzejuk
  • 5,076
  • 2
  • 22
  • 31
1

If you have internal locking then you can't just return references to your cached maps without causing race conditions. You also need to provide the caller with a means to lock the data from modification. If it is efficiency you are looking for then one design to consider is something like this:

class Cache
{
public:
    using mutex_type   = std::shared_timed_mutex;
    using reading_lock = std::shared_lock<mutex_type>;
    using writing_lock = std::unique_lock<mutex_type>;

    using map_type = std::map<std::string, std::vector<std::string>>;

    reading_lock lock_for_reading() const { return reading_lock{mtx}; }
    writing_lock lock_for_writing()       { return writing_lock{mtx}; }

    map_type const& use() const { return m; }

private:

    void update_map()
    {
        // protect every update with a writing_lock
        auto lock = lock_for_writing();

        // safely update the cached map
        m["wibble"] = {"fee", "fie", "foe", "fum"};
    }

    mutable mutex_type mtx;
    map_type   m = {{"a", {"big", "wig"}}, {"b", {"fluffy", "bunny"}}};

};

int main()
{
    Cache cache;

    { // start a scope just for using the map

        // protect every access with a reading_lock
        auto lock = cache.lock_for_reading();

        // safely use the cached map
        for(auto const& s: cache.use().at("a"))
            std::cout << s << '\n';

    } // the lock is released here

    // ... etc ...
}

By making the locking available to the caller you let them protect the data from race conditions while they are reading it. By using both read & write locks you gain performance because you know the callers will not be modifying the cached data.

Internally, when you update the cached maps, you need to use the writing_lock to make sure no one else is reading them while they are being updated.

For efficiency's sake you probably want a separate mutex for each map but it depends on the specific dynamics of your situation.

Note: This solution puts responsibility on the caller to correctly lock the data. A more robust (and complex) solution is suggested here: An idea for the GSL to make multithreading code safer with an example implementation here: gsl_lockable.

Galik
  • 47,303
  • 4
  • 80
  • 117
  • Doesn't this design allow modification of the container while reading? – Robert Andrzejuk Feb 21 '18 at 09:54
  • @RobertAndrzejuk Not if you apply the locks as shown in the example. A `write_lock` will kick out all the readers before proceeding. But it does put responsibility on the caller to correctly lock the resource. Improving on that gets quite complicated. – Galik Feb 21 '18 at 10:04
  • 1
    I have a similar locking mechanism for reads and writes. Thanks for the detailed explanation. Really appreciate it! – Cool Geek Feb 21 '18 at 15:40