
I have an app, and I'm early enough in the design to walk back the database choice. I'd like to use MongoDB, but here's where I'm running into potential issues: I will be computing averages frequently. Consider this case:

  • A trip leg is a certain number of miles long
  • A trip leg consumes a given amount of fuel
  • The average fuel economy is a computed value: simply miles divided by gallons
  • A more interesting statistic is the average economy of everyone doing the same leg
  • Another interesting statistic is the average economy of everyone doing a leg near specified start and end points

The last point involves a map/reduce across a query: obtain the total number of miles driven and divide by the total number of gallons consumed. Is this going to make my server melt down?
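Concretely, something like this is what I have in mind (a sketch only; the TripLeg model, its fields, and the leg_id filter are placeholders, not my real schema). Note it sums first and divides once, so the result is total miles over total gallons rather than an average of per-driver ratios:

    # Sketch: TripLeg, :miles, :gallons, and leg_id are hypothetical.
    map = <<-JS
      function() {
        emit("economy", { miles: this.miles, gallons: this.gallons });
      }
    JS

    reduce = <<-JS
      function(key, values) {
        var totals = { miles: 0, gallons: 0 };
        values.forEach(function(v) {
          totals.miles   += v.miles;
          totals.gallons += v.gallons;
        });
        return totals;  // same shape as the emitted values, so re-reduce is safe
      }
    JS

    # leg_id: placeholder for the shared leg's identifier
    doc = TripLeg.where(leg_id: leg_id).map_reduce(map, reduce).out(inline: 1).first
    avg_mpg = doc["value"]["miles"] / doc["value"]["gallons"]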

I'm using Mongoid in a Rails app. Is there any friction I'm injecting here, or will it streamline common use cases like insert, delete, update, and query just fine?
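For context, the day-to-day usage I mean is nothing exotic; the model and field names below are illustrative, not my actual schema:

    # Hypothetical Mongoid model for a trip leg.
    class TripLeg
      include Mongoid::Document
      field :miles,   type: Float
      field :gallons, type: Float
      field :start,   type: Array   # [longitude, latitude]
      field :finish,  type: Array   # [longitude, latitude]
      index({ start: "2d" })        # geo index for the nearness queries
    end

    leg = TripLeg.create!(miles: 212.4, gallons: 7.9,
                          start: [-122.42, 37.77], finish: [-118.24, 34.05])
    leg.update_attributes!(gallons: 8.1)    # update
    TripLeg.where(:miles.gt => 100).count   # query
    leg.destroy                             # delete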

The other candidate database is Postgres, which also handles location data, but it is not schemaless in the way Mongo is.

I recognize that some of this calls for opinion, but perhaps this is information that would benefit SO users.

Thanks!

  • Do you have time to wait for MongoDB to do MapReduce? It's a batch function that should be run in the background. MapReduce is not a realtime function: http://stackoverflow.com/questions/3947889/mongodb-terrible-mapreduce-performance – mttdbrd May 02 '14 at 17:46
  • Background computation would kind of mess up the idea of the app, which is to provide users with a complete answer to their question: "what kind of mileage are people getting on drives between these points?" It seems there's a mismatch between MongoDB and what I'm driving at here unless I'm misunderstanding you @mttdbrd. – Steve Ross May 02 '14 at 18:07
  • Well, I don't think MapReduce is going to work for you. Basically the idea of MapReduce is that you run it say once a day and for the rest of the next day, query the results of the operation. It's not a real-time operation, or it's not designed to be. It may be fast for (very) small data sets, but it's designed for large scale data processing. See here: https://en.wikipedia.org/wiki/MapReduce#Performance_considerations – mttdbrd May 02 '14 at 18:11
  • Actually, re-reading your post, you should be able to use MongoDB. The average fuel-efficiency won't change much from day to day, so you should be able to just generate a new fuel-efficiency MapReduce every day or every couple of days in the background. – mttdbrd May 02 '14 at 18:13
  • How many documents will you have for the MR step? – mttdbrd May 02 '14 at 18:18
  • 1
    would you be able to use the aggregation pipeline with $near, see http://docs.mongodb.org/manual/reference/operator/aggregation/geoNear/ rather than map/reduce? The spatial query has to be the first in the pipeline, but that would seem to fit with your use case. You can't shard on a geospatial field, but then you can't in Postgres/Postgis either -- 2d indexes raise particular problems in this regard. – John Powell May 03 '14 at 07:51
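A sketch of the aggregation route @John Powell suggests, with $geoNear as the first stage (this assumes the hypothetical TripLeg model above; coordinates and maxDistance are placeholders, and it filters on the start point only, since $geoNear works against a single geo field):

    # Sketch of the aggregation-pipeline alternative to map/reduce.
    # Assumes the "2d" index on :start; all values are placeholders.
    pipeline = [
      { "$geoNear" => {
          "near"          => [-122.42, 37.77],  # the query's start point
          "distanceField" => "dist",
          "maxDistance"   => 0.05               # in the units of the 2d index
      }},
      { "$group" => {
          "_id"     => nil,
          "miles"   => { "$sum" => "$miles" },
          "gallons" => { "$sum" => "$gallons" }
      }},
      { "$project" => {
          "_id"     => 0,
          "avg_mpg" => { "$divide" => ["$miles", "$gallons"] }
      }}
    ]

    TripLeg.collection.aggregate(pipeline).first  # => { "avg_mpg" => ... }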

0 Answers