Which container is most efficient for multiple insertions / deletions in C++?

Question

I was set a homework challenge as part of an application process (I was rejected, by the way; I wouldn't be writing this otherwise) in which I was to implement the following functions:

// Store a collection of integers
class IntegerCollection {
public:
  // Insert one entry with value x
  void Insert(int x);

  // Erase one entry with value x, if one exists
  void Erase(int x);

  // Erase all entries, x, from <= x < to
  void Erase(int from, int to);

  // Return the count of all entries, x, from <= x < to
  size_t Count(int from, int to) const;

The functions were then put through a bunch of tests, most of which were trivial. The final test was the real challenge as it performed 500,000 single insertions, 500,000 calls to count and 500,000 single deletions.

The member variables of IntegerCollection were not specified and so I had to choose how to store the integers. Naturally, an STL container seemed like a good idea and keeping it sorted seemed an easy way to keep things efficient.

Here is my code for the four functions using a vector:

// Previous bit of code shown goes here 

private:
  std::vector<int> integerCollection;
};

void IntegerCollection::Insert(int x) {

  /* using lower_bound to find the right place for x to be inserted
  keeps the vector sorted and makes life much easier */
  auto it = std::lower_bound(integerCollection.begin(), integerCollection.end(), x);
  integerCollection.insert(it, x);
}

void IntegerCollection::Erase(int x) {

  // find the location of the first element containing x and delete if it exists
  auto it = std::find(integerCollection.begin(), integerCollection.end(), x);

  if (it != integerCollection.end()) {
    integerCollection.erase(it);
  }

}

void IntegerCollection::Erase(int from, int to) {

  if (integerCollection.empty()) return;

  // lower_bound points to the first element of integerCollection >= from/to
  auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
  auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);

  /* std::vector::erase deletes entries between the two pointers
  fromBound (included) and toBound (not indcluded) */
  integerCollection.erase(fromBound, toBound);

}

size_t IntegerCollection::Count(int from, int to) const {

  if (integerCollection.empty()) return 0;

  int count = 0;

  // lower_bound points to the first element of integerCollection >= from/to
  auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
  auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);

  // increment pointer until fromBound == toBound (we don't count elements of value = to)
  while (fromBound != toBound) {
    ++count; ++fromBound;
  }

  return count;

}

The company got back to me saying that they wouldn't be moving forward because my choice of container meant the runtime complexity was too high. I also tried using list and deque and compared the runtime. As I expected, I found that list was dreadful and that vector took the edge over deque. So as far as I was concerned I had made the best of a bad situation, but apparently not!

I would like to know what the correct container to use in this situation is? deque only makes sense if I can guarantee insertion or deletion to the ends of the container and list hogs memory. Is there something else that I'm completely overlooking?

It seems that you want your container to be sorted, is that correct? Then how about `std::multiset` (or plain `std::set` if you don't want duplicates)? — Some programmer dude, Jun 04 '19 at 19:06
If you know the bounds of your minimum and maximum possible inputs, you could do this with buckets and achieve O[1] insertion, deletion, etc. Takes memory proportional to [max-min]. — ArtHare, Jun 04 '19 at 19:08
`void Erase(int from, int to);` implies ordering, which an unordered collection will not give you without enumeration. If your collection allows duplicates, a `std::multiset` is a viable choice, otherwise `std::set` is the candidate. — WhozCraig, Jun 04 '19 at 19:08
Bjarne says "use vectors" so if the company rejected your code - show them this (very educational) video: https://www.youtube.com/watch?v=YQs6IC-vgmo — YePhIcK, Jun 04 '19 at 19:09
`std::vector` wins in performance over the other containers up to shockingly large data sets simply because it's unbeatable in terms of cache locality. Intuitively, some variant of `std::set` would have been a safer bet, but it's worth measuring at what data set size it starts beating out `std::vector`. `std::set` may very well not catch up to `std::vector` by the time you get to your 500,000 element size. But if they cared about performance, they would have specified/provided the hardware. It seems they were just interested in the order of time complexity. — François Andrieux, Jun 04 '19 at 19:12
How often count will be invoked in comparison with insert, erase? nth_element has lower complexity than sorting. — , Jun 04 '19 at 19:13
what about some cheating? I mean its integers... you could have a `int x[500000]` that stores at `x[i]` how often `i` is present in the container. Inserting and erasing as cheap as it can get — 463035818_is_not_an_ai, Jun 04 '19 at 19:14
@formerlyknownas_463035818 Because unless you live in a 16-bit micro-controler world, you'll need an array of 2gB minimum. The dude gave a decent answer to the dreadfully phrased question from what are no-doubt elitist pinheads. If that wasn't accepted, with no real explanation for why, I somewhat suspect short-cutting a domain-restricted cheat would have found the round-file even faster. The only "correct" answer is "I can't answer that without knowing a LOT more about the problem I'm trying to solve". And I'm sure that too would have been kicked to the curb. — WhozCraig, Jun 04 '19 at 19:17
Note that since your vector is sorted, your `std::find` in `Erase(int x)` could be replaced with a binary search to reduce complexity. `std::lower_bound` will already be doing a binary search, so you're good there. — scohe001, Jun 04 '19 at 19:22
One comment on the code I would make is that I'd watch out for those `if (integerCollection.empty())` checks. They don't seem save a lot of time (the functions are fast and correct anyway when the range is empty) and they introduce branches. Depending on the use case, branches can be a very significant performance sink. Removing them also makes the core logic of your functions that much clearer by eliminating clutter. — François Andrieux, Jun 04 '19 at 19:25
@WhozCraig nice that you are no elitist pinhead, and dont get tired of telling me when I am talking non-sense ;). I have far too little experience with interviews/applications to have even the slightest idea what the "correct" answer is supposed to be. Actually I could imagine that it wasnt rejected because it uses `std::vector` but for some crude reason that we will never know — 463035818_is_not_an_ai, Jun 04 '19 at 19:25
I have a sneaking suspicion that the homework was designed simply to see if the applicant was familiar with `std::map`, or not. — Jeremy Friesner, Jun 04 '19 at 19:26
@formerlyknownas_463035818 No no. Don't misunderstand. Your idea (a lookup table) is outstanding if you're dealing with a well-restricted domain. Ex: if this were a `char` or `unsigned char` collection that idea would be *stellar* (I've used it many times in the past, and performance is blistering). It just isn't a good fit in a domain this large. That's all. — WhozCraig, Jun 04 '19 at 19:27
@WhozCraig all fine, I didnt misunderstand. " It just isn't a good git in a domain this large." ie not the right approach for an interview question — 463035818_is_not_an_ai, Jun 04 '19 at 19:28
@formerlyknownas_463035818 Depends on the question. If this was posted as a collection of some 8-bit entities (char or unsigned char), my answer would be yours pretty much verbatim. The cache locality would be screaming. In short, its *the* answer if the question matched the restriction. Don't think it wouldn't be right; just not to this question, and not because its an interview, but rather because it just doesn't fit well with `int`. That you considered an alternative to traditional sets is good. Don't sell yourself short. Hope that makes sense. — WhozCraig, Jun 04 '19 at 19:33
Use `std::vector` already and be done with it. It'll outperform anything else most of the time. — Jesper Juhl, Jun 04 '19 at 19:35
This can be done with balanced search trees. As the others say, `std::set` is usually a BST. Insert/Erase is O(log(n)), EraseMultiple is O(log(n)+k) (where k is the number of erased elements), Count can be done in O(log(n)), if nodes store the size of child-tree. — geza, Jun 04 '19 at 19:37
@FrançoisAndrieux: I'm not that sure that for 500,000 elements, vector will be faster. I think that the crossover happens much sooner. 500,000 means 250,000 average accesses, 1MB of data. This doesn't fit into the L1 cache. On the other hand, `set` will only need to get ~20 lines (~1KB) into the cache, so it has much lower memory pressure. — geza, Jun 04 '19 at 19:42

score 5 · Answer 1 · answered Jun 04 '19 at 20:56

We cannot know what would make the company happy. If they reject std::vector without concise reasoning I wouldn't want to work for them anyway. Moreover, we dont really know the precise requirements. Were you asked to provide one reasonably well performing implementation? Did they expect you to squeeze out the last percent of the provided benchmark by profiling a bunch of different implementations?

The latter is probably too much for a homework challenge as part of an application process. If it is the first you can either

roll your own. It is unlikely that the interface you were given can be implemented more efficiently than one of the std containers does... unless your requirements are so specific that you can write something that performs well under that specific benchmark.
std::vector for data locality. See eg here for Bjarne himself advocating std::vector rather than linked lists.
std::set for ease of implementation. It seems like you want the container sorted and the interface you have to implement fits that of std::set quite well.

Let's compare only isertion and erasure assuming the container needs to stay sorted:

   operation         std::set          std::vector
   insert            log(N)            N
   erase             log(N)            N

Note that the log(N) for the binary_search to find the position to insert/erase in the vector can be neglected compared to the N.

Now you have to consider that the asymptotic complexity listed above completely neglects the non-linearity of memory access. In reality data can be far away in memory (std::set) leading to many cache misses or it can be local as with std::vector. The log(N) only wins for huge N. To get an idea of the difference 500000/log(500000) is roughly 26410 while 1000/log(1000) is only ~100.

I would expect std::vector to outperform std::set for considerably small container sizes, but at some point the log(N) wins over cache. The exact location of this turning point depends on many factors and can only reliably determined by profiling and measuring.

icodebot · Answer 2 · 2019-06-04T22:44:26.960

The first thing that comes to mind is to hash the integer value so single look ups can be done in constant time.

The integer value can be hashed to compute an index in to an array of bools or bits, used to tell if the integer value is in the container or not.

Counting and and deleting large ranges could be sped up from there, by using multiple hash tables for specific integer ranges.

If you had 0x10000 hash tables, that each stored ints from 0 to 0xFFFF and were using 32 bit integers you could then mask and shift the upper half of the int value and use that as an index to find the correct hash table to insert / delete values from.

IntHashTable containers[0x10000];
u_int32 hashIndex = (u_int32)value / 0x10000;
u_int32int valueInTable = (u_int32)value - (hashIndex * 0x10000);
containers[hashIndex].insert(valueInTable);

Count for example could be implemented as so, if each hash table kept count of the number of elements it contained:

indexStart = startRange / 0x10000;
indexEnd = endRange / 0x10000;

int countTotal = 0;
for (int i = indexStart; i<=indexEnd; ++i) {
   countTotal += containers[i].count();
}

Suggesting to use an hashed container and later switching to binary search is not easy to do, even assuming this could apply to the problem at hand. At least an hint of an implementations should be provided to qualify this as a good answer. — Costantino Grana, Jun 04 '19 at 20:32
I am not suggesting to switch between a hashed container and a binary search. I'm suggesting the use of a binary search that uses a hash table to determine if an integer is present within a specific range. — icodebot, Jun 04 '19 at 21:03
OK I removed the comment I made about binary search, and added a hopefully better explanation to clarify. — icodebot, Jun 04 '19 at 22:52

score 1 · Answer 3 · answered Jun 04 '19 at 20:26

Nobody knows which container is MOST efficient for multiple insertions / deletions. That is like asking what is the most fuel-efficient design for a car engine possible. People are always innovating on the car engines. They make more efficient ones all the time. However, I would recommend a splay tree. The time required for a insertion or deletion is a splay tree is not constant. Some insertions take a long time and some take only a very a short time. However, the average time per insertion/deletion is always guaranteed to be be O(log n), where n is the number of items being stored in the splay tree. logarithmic time is extremely efficient. It should be good enough for your purposes.

score 0 · Answer 4 · answered Jun 04 '19 at 19:32

Not sure if using sorting really is a requirement for removing the range. It might be based on position. Anyway, here is a link with some hints which STL container to use. In which scenario do I use a particular STL container? Just FYI. Vector maybe a good choice, but it does a lot of re allocation, as you know. I prefer deque instead, as it doesn't require big chunk of memory to allocate all items. For such requirement as you had, list probably fit better.

score 0 · Answer 5 · answered Jun 05 '19 at 00:12

Basic solution for this problem might be std::map<int, int> where key is the integer you are storing and value is the number of occurences.

Problem with this is that you can not quickly remove/count ranges. In other words complexity is linear.

For quick count you would need to implement your own complete binary tree where you can know the number of nodes between 2 nodes(upper and lower bound node) because you know the size of tree, and you know how many left and right turns you took to upper and lower bound nodes. Note that we are talking about complete binary tree, in general binary tree you can not make this calculation fast.

For quick range remove I do not know how to make it faster than linear.

Which container is most efficient for multiple insertions / deletions in C++?

5 Answers5