5

Let's say I want to have a container of Apples that have different prices. I want them sorted by their price always (highest price first), but I also want to quickly retrieve them by their id. What I have so far is the following

struct AppleClass
{
    string id;
    int price;

    bool operator<(const AppleClass& o) const 
    {
        return price > o.price; 
    }
};

int main() 
{
    set<AppleClass> myapples;
    myapples.insert({"apple1", 500});
    myapples.insert({"apple2", 600});
    myapples.insert({"apple3", 400});

    for (auto& apple : myapples)
    {
        cout << apple.id << "," << apple.price << endl;         
    }
}

http://ideone.com/NcjWDZ

My app will spend 20% of it's time removing entries, 20% inserting entries, 25% retrieving them (retrieving the whole list), and 35% updating them (their price will increase or decrease) often.

The container will have a maximum of 450 entries.

My code only solves the sort problem. Find is useless since I want to find by their id (so I need to iterate trough all of them). Removing and inserting would be slow for the same reason.

It feels like the wrong choice.

But If I have a map then it would be ordered based on the id. And every time I retrieve the list I would have to copy it to some container for example, order it, and then send it to the user, which also feels slow.

Help!

Gam
  • 1,254
  • 1
  • 9
  • 18
  • 1
    To address this question requires a bit more chattiness than is acceptable for stack overflow questions. If you've written the code to attempt to address your problem and it just wasn't working correctly, it's on topic here. If your question is "how do I get started with this?" it's not. – mah May 09 '16 at 01:41
  • @mah So according to you, if my code works, but is slow, then I shouldn't ask in stackoverflow which other container would fix it? since I have no idea? – Gam May 09 '16 at 01:50
  • You haven't really given enough information. Why does retrieving in different ways even matter? However, a simple way is to use a `std::vector`, and maintain two copies - one sorted by price, and the other by id. Also, instead of `id` being `const char *`, use `std::string`. That allows ids to be set a run time (e.g. based on user input), rather than requiring string literals. – Peter May 09 '16 at 01:51
  • 2
    You could use a second container to provide fast indexing into your existing `set`, i.e. `std::map::iterator>`. boost provides a [multi-indexed container library](http://www.boost.org/doc/libs/1_58_0/libs/multi_index/doc/index.html) to simplify this. – Tony Delroy May 09 '16 at 01:53
  • What is the problem size? – Ivan Aksamentov - Drop May 09 '16 at 01:53
  • You have 2 words "feels" in performance-related problem description. In no case should you rely on your feelings and intuition when reasoning about runtime performance. Compare [asymptotic complexities](http://stackoverflow.com/a/16365513/1607442), estimate constants involved and necessarily measure a real-life (!) application. Profile caches and memory. Find ways to parallelize your application. Otherwise you are just guessing and designing perfectly solid apples in vacuum and wasting your and our time. – Ivan Aksamentov - Drop May 09 '16 at 02:01
  • One flaw of your current design, is that apple prices are likely to duplicate (unless they are unique in this case), and `std::set` will only store one of the apples of the same price. You can try 'std::multiset'. But before you do, consider the access pattern to your apple container: how many access by id and how many access to the price-sorted list. – xiaofeng.li May 09 '16 at 02:16
  • 1
    @TonyD Thanks, boost MultiIndex seems good. – Gam May 09 '16 at 08:56
  • @Phantom that is correct; I would encourage you to read the Help pages (linked at the top of every page). While some of that can be difficult to apply, surely you agree that your question is looking for an opinion and that there are multiple valid opinions (which disagree with each other). Also consider that one of the standard reasons to close a question is: Primarily opinion based: Many good questions generate some degree of opinion based on expert experience, but answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise. – mah May 09 '16 at 15:44

1 Answers1

1

You can do it with two containers, one that's kept sorted by price (priority queue or linked list), and one that indexes your ids (a hash map). In order to save space, you can have both structures keep only pointers to your Apple instances, you'll need to write a custom sort functor for that though.

This way, your entry removal is O(1), insertion is O(log n), retrieval is also O(1), and updating is O(log n). I think that should be good, especially when you're dealing with such a small number of items (450).

EDIT:

Elaborating on operation costs:

All these operations are constant time for the hash map, so this is about the priority queue:

  • Removal: you can get an amortized cost of O(1) with the priority queue if you defer that cost to inserts. There are lots of clever implementations out there that will show you how to do this.
  • Insertion: Any basic priority queue implementation will do this in O(log n), it gets a little tricky to keep it that way if you want constant time removal.
  • Retrieval: You use the map for this, no need to look through the priority queue.
  • Updating: I don't think there's a (easy) way to get better complexity than O(log n) for updating, even amortized, you'll probably have to shuffle things around in the priority queue for the average case.
yelsayed
  • 5,236
  • 3
  • 27
  • 38