I am looking for a way to update every document in a collection called "posts".

Posts get updated periodically with a popularity (sitewide popularity) and a strength (the estimated relevance to that particular user), each from different sources. What I need to do is multiply popularity and strength on each post to get a third field, relevance. Relevance is used for sorting the posts.

class Post
  include Mongoid::Document

  field :popularity
  field :strength
  field :relevance
  ...
end

The current implementation is as follows:

1) I map/reduce down to a separate collection, which stores the post id and calculated relevance.

2) I update every post individually from the map reduce results.

This is a huge number of individual update queries, and it seems silly to map each post to its own result (1-to-1), only to update the post again. Is it possible to multiply in place, or do some sort of in-place map?

Matt McCormick
  • This might be a duplicate if my solution here works: http://stackoverflow.com/a/8230759/131227 – Mark Bolusmjak Feb 01 '12 at 19:50
  • Thanks for the link. I am hesitant to overwrite entire documents in the original collection, because the post collection is updated periodically as users take actions on the site, and I'd be worried about concurrency. – Matt McCormick Feb 01 '12 at 21:25

1 Answer

Is it possible to multiply in place, or do some sort of in-place map?

Nope.

The ideal here would be to have the Map/Reduce update the Post directly when it is complete. Unfortunately, M/R does not have that capability. In theory, you could issue updates from the "finalize" stage, but this will break down in a sharded environment.

However, if all you are doing is a simple multiplication, then you don't really need M/R at all. You can just run a big for loop, or you can hook into the save event to update :relevance whenever :popularity or :strength is updated.

MongoDB doesn't have triggers, so it can't do this automatically. But you're using a business layer, which is exactly the place to put this kind of logic.
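As a rough illustration of the "big for loop" idea, here is a minimal sketch in plain Ruby. It models posts as hashes so it runs without a database, and it only builds the per-post update specifications rather than sending them. The field names come from the question; the helper name `relevance_updates` and the `filter`/`update` hash layout are assumptions made for illustration, not part of Mongoid or the MongoDB driver.

```ruby
# Sketch only: posts are plain hashes here, standing in for documents
# read from the "posts" collection. `relevance_updates` is a hypothetical
# helper, not a Mongoid or driver method.

# Build one targeted update per post that sets only :relevance,
# so no other field of the stored document is rewritten.
def relevance_updates(posts)
  posts.each_with_object([]) do |post, updates|
    popularity = post[:popularity]
    strength   = post[:strength]
    # Skip posts that are missing either input.
    next if popularity.nil? || strength.nil?

    updates << {
      filter: { _id: post[:_id] },
      update: { "$set" => { relevance: popularity * strength } }
    }
  end
end

posts = [
  { _id: 1, popularity: 10, strength: 0.5 },
  { _id: 2, popularity: 4,  strength: 2 },
  { _id: 3, popularity: 7,  strength: nil } # skipped: no strength yet
]

updates = relevance_updates(posts)
updates.each { |u| puts u.inspect }
```

Each entry would still have to be sent as its own update, and how that is done depends on the driver or Mongoid version in use. Because the update touches only relevance via `$set`, it does not overwrite concurrent atomic changes to popularity or strength, although a relevance computed from a value that changed in the meantime stays stale until the next pass.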

Gates VP
  • Thanks for the answer. However, posts are actually user-specific, so that the same actual content may be pushed to many user feeds, but with info specific to that user (strength is user-specific). Both strength and popularity are updated using atomic updates, so a before_save won't work. While I could run a big for loop, this involves two queries (one to get and one to save) for each post, whereas with M/R I have one M/R, then one atomic update for each result. – Matt McCormick Feb 01 '12 at 21:21