4

I'm considering CQEngine for a project where I need to handle lots of real-time events and run queries from time to time. It works well for returning results, but I noticed that the larger the collection gets, the slower it becomes to add or remove elements.

I have a few simple indexes on the collection, so I'm assuming the delay is because the indexes are updated on each event added or removed. I also get an OutOfMemoryError with large numbers of events, which I think comes from the indexes growing along with the collection.

So my question is: what is the indexing penalty in CQEngine for a fast-changing collection (one where elements are often added and removed)?

user2864740
JohnDoDo

1 Answer

5

If you have a lot of unique values in attributes that you index, you would probably benefit from IndexQuantization discussed on the site.

This is a way to tune the tradeoff between memory usage and retrieval speed. But it's especially useful to reduce the size of indexes in memory, if you have a large number of unique values.
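To make the tradeoff concrete, here is a minimal plain-Java sketch of the idea behind quantization. It is not CQEngine code: the class `QuantizedIndexSketch` and its methods are hypothetical names invented for this illustration, and the grouping rule (round each value down to a multiple of a factor) is just one possible choice. Adjacent values share one index key, so the index holds fewer keys, but each lookup has to filter the bucket it lands in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a quantized index; not part of CQEngine.
class QuantizedIndexSketch {
    private final int factor;
    private final Map<Integer, List<Integer>> buckets = new HashMap<>();

    QuantizedIndexSketch(int factor) {
        this.factor = factor;
    }

    // Round a value down to its bucket key, e.g. factor 10: 0..9 -> 0, 10..19 -> 10.
    int quantize(int value) {
        return Math.floorDiv(value, factor) * factor;
    }

    void add(int value) {
        buckets.computeIfAbsent(quantize(value), k -> new ArrayList<>()).add(value);
    }

    // Fetch the whole bucket, then filter out values that merely share the key.
    List<Integer> retrieve(int value) {
        List<Integer> result = new ArrayList<>();
        for (int candidate : buckets.getOrDefault(quantize(value), List.of())) {
            if (candidate == value) {
                result.add(candidate);
            }
        }
        return result;
    }

    int keyCount() {
        return buckets.size();
    }

    public static void main(String[] args) {
        QuantizedIndexSketch index = new QuantizedIndexSketch(10);
        for (int i = 0; i < 1000; i++) {
            index.add(i);
        }
        System.out.println(index.keyCount());   // 100 keys instead of 1000
        System.out.println(index.retrieve(42)); // [42]
    }
}
```

A larger factor means fewer keys and less per-key overhead, at the cost of more filtering per lookup.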

FYI you can also ask questions in the CQEngine discussion forum.

Hope that helps!

Inego
npgall
  • Thank you for taking the time to respond. I don't know if IndexQuantization would help, since the values I'm indexing, although unique, are not in sequence. I will give it a try though. Regarding the addition/removal of elements, I noticed that adding elements is slower with indexes on the collection than without. Is that because the index gets reorganized on each add/remove? – JohnDoDo Apr 17 '14 at 18:53
  • It might depend on which type of index you are talking about, as you didn't mention it. They all use different data structures. But in general, no, indexes are not reorganized as each object is added. The HashIndex is basically a ConcurrentHashMap<AttributeValue, Set<Object>>. So adding an object will either add a new entry (a new Set) to the CHM (if no objects with that attribute value are stored yet) or add the object to the existing Set if one is already stored. – npgall Apr 17 '14 at 22:43
  • If your values aren't in sequence, you could write your own implementation of the Quantizer interface. For example, a simple implementation would take the attribute value modulo the number of entries you want in the index (e.g. value % desired_index_size). – npgall Apr 17 '14 at 22:48
  • Also, if the values you are indexing uniquely identify an object (that is, primary-key-type attributes), then you can also try the [UniqueIndex](http://cqengine.googlecode.com/svn/cqengine/javadoc/apidocs/com/googlecode/cqengine/index/unique/UniqueIndex.html). – npgall Apr 17 '14 at 22:52
  • Using a UniqueIndex solved the memory usage (I was initially using HashIndexes, which were causing lots of memory usage). For the delays I'm investigating further, but will probably post on the forum. Thanks a lot! – JohnDoDo Apr 18 '14 at 08:58
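The difference between the two index shapes discussed in the comments can be sketched in plain Java. This is only an illustration under assumptions: `HashStyleIndex` and `UniqueStyleIndex` are hypothetical stand-ins, not CQEngine classes, and rejecting duplicate keys with an exception is a choice made for this sketch. The point is that a hash-style index allocates one Set per distinct attribute value, while a unique-style index stores the object directly, which saves a Set per entry when every value is distinct.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration; not CQEngine's actual classes.
class IndexShapesSketch {

    // Hash-style index: attribute value -> set of objects having that value.
    static class HashStyleIndex<A, O> {
        final Map<A, Set<O>> map = new ConcurrentHashMap<>();

        void add(A value, O object) {
            // Creates a new Set only when the value is seen for the first time.
            map.computeIfAbsent(value, k -> ConcurrentHashMap.newKeySet()).add(object);
        }

        void remove(A value, O object) {
            // Drops the entry entirely once its Set becomes empty.
            map.computeIfPresent(value, (k, set) -> {
                set.remove(object);
                return set.isEmpty() ? null : set;
            });
        }
    }

    // Unique-style index: attribute value -> the single object with that value.
    static class UniqueStyleIndex<A, O> {
        final Map<A, O> map = new ConcurrentHashMap<>();

        void add(A value, O object) {
            if (map.putIfAbsent(value, object) != null) {
                throw new IllegalStateException("duplicate key: " + value);
            }
        }

        void remove(A value, O object) {
            map.remove(value, object);
        }
    }

    public static void main(String[] args) {
        HashStyleIndex<Integer, String> hash = new HashStyleIndex<>();
        UniqueStyleIndex<Integer, String> unique = new UniqueStyleIndex<>();
        for (int id = 0; id < 3; id++) {
            hash.add(id, "event-" + id);   // one Set allocated per distinct id
            unique.add(id, "event-" + id); // no Set, the object is stored directly
        }
        System.out.println(hash.map.size() + " sets vs " + unique.map.size() + " direct entries");
    }
}
```

In both shapes an add or remove touches a single map entry, so nothing is reorganized per operation; the extra cost relative to an unindexed collection is the map update itself, repeated for each index.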