If you really want to squeeze this down, there is a dumb/simple approach for your case (show top 1%).
This optimization works because, on average, only 1 out of 100 popularity changes will knock out one of the top 1%. (This assumes updates are uniformly distributed across items; with a more typical power-law distribution, it could happen much more frequently.)
Sort the entire initial collection, then:

- Store only the top 10 in any sorted data structure (e.g. a BST)
- Store the popularity score of #10 (e.g. in a variable `minVisiblePopularity`)
With each subsequent popularity change in the collection, compare against `minVisiblePopularity`:
- If the new popularity rises above `minVisiblePopularity`, update the top-10 structure and `minVisiblePopularity` accordingly.
- Or, if the old popularity was above `minVisiblePopularity` but the new popularity falls below it (i.e. a former top-10 item is getting knocked out), find the new #10 in the collection and update the structure and `minVisiblePopularity` accordingly.
This adds a minimal storage requirement: an extremely small binary search tree (10 items) plus a single primitive variable. The tree then only requires updating when a popularity change knocks an item into or out of the previous top 10.
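A minimal sketch of the idea in Python, assuming the collection is a dict of item id to popularity score (the class and method names are illustrative, and a dict stands in for the BST since the visible set is tiny):

```python
import heapq

class TopN:
    """Track the N most popular items plus the visibility threshold."""

    def __init__(self, items, n=10):
        # items: dict mapping item id -> popularity score
        self.n = n
        self.items = items
        top = heapq.nlargest(n, items.items(), key=lambda kv: kv[1])
        self.top = dict(top)                       # the visible top-n set
        self.min_visible = min(self.top.values())  # popularity of item #n

    def update(self, item_id, new_pop):
        self.items[item_id] = new_pop
        if item_id in self.top:
            self.top[item_id] = new_pop
            if new_pop < self.min_visible:
                # A former top-n item may have been knocked out: find the
                # new #n by rescanning the full collection (the rare case).
                top = heapq.nlargest(self.n, self.items.items(),
                                     key=lambda kv: kv[1])
                self.top = dict(top)
            self.min_visible = min(self.top.values())
        elif new_pop > self.min_visible:
            # New entrant: evict the current #n and admit the item.
            evicted = min(self.top, key=self.top.get)
            del self.top[evicted]
            self.top[item_id] = new_pop
            self.min_visible = min(self.top.values())
        # Otherwise: the common case, no top-n bookkeeping needed at all.
```

The common case (a change that stays below the threshold) touches nothing but one comparison; the full rescan only runs when a visible item drops out, which matches the 1-in-100 amortized argument above.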