I'm implementing a database that will feed data to some decision support tools for agriculture. I've been reading about MongoDB and its best practices (blog.engineyard.com/2011/mongodb-best-practices, http://s3.amazonaws.com/info-mongodb-com/MongoDB-Performance-Best-Practices.pdf), but I still have some questions about the best way to design the database. There seem to be many different approaches, even here (stackoverflow.com/questions/5373198/mongodb-relationships-embed-or-reference).

My database needs to store a large amount of data. I'm going to have many weather stations, each with a lot of data. My schemas are below.

This schema stores the weather stations. I've created an object called location that contains country, state, city and county. Sometimes I need to find stations by country, sometimes by state, and so on.

var weatherStationSchema = new mongoose.Schema({
    name: String,

    active: {
       type: Boolean,
       default: true
    },

    private: { 
       type: Boolean,
       default: false
    },

    organization: {
       type: mongoose.Schema.ObjectId,
       ref: 'Organization'
    },

    location: {
       country: {
           type: mongoose.Schema.ObjectId,
           ref: 'Country',
           required: true
       },
       state: {
           type: mongoose.Schema.ObjectId,
           ref: 'State',
           required: true
       },
       city: {
           type: mongoose.Schema.ObjectId,
           ref: 'City'
       },
       county: {
           type: mongoose.Schema.ObjectId,
           ref: 'County'
       },

       lat: Number,
       lon: Number,

       utcOffset: Number,
       zoneName: String
    },

    metaData: {},
    __v: {
        type: Number,
        select: false
    }

});


var WeatherStation = mongoose.model('WeatherStation', weatherStationSchema);
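
For the lookups by country, state and so on mentioned above, nested fields can be queried with dot notation such as 'location.country'. The following is only a minimal sketch in plain JavaScript: buildStationFilter is a hypothetical helper name, and the ids passed in are assumed to be the ObjectIds of the referenced Country, State, City or County documents.

```javascript
// Hypothetical helper: build a MongoDB filter object from whichever
// location ids are known. Levels that are missing are simply left out.
function buildStationFilter(location) {
  const filter = {};
  for (const level of ['country', 'state', 'city', 'county']) {
    if (location[level] !== undefined && location[level] !== null) {
      // Dot notation targets a field nested inside the "location" object.
      filter['location.' + level] = location[level];
    }
  }
  return filter;
}

// Possible usage with the model defined above (not executed here):
// WeatherStation.find(buildStationFilter({ country: countryId, state: stateId }))
```

If these lookups turn out to be frequent, an index on the queried fields (for example on 'location.country' and 'location.state') would probably be worth testing.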

This is the schema for my data documents. The problem is that I'm going to have a lot of data inside this document. I ran some tests and found that, going this way, I'll only be able to store about 106,000 records (see Understanding MongoDB BSON Document size limit).

var wsDataSchema = new mongoose.Schema({
    weatherStation: {
        type: mongoose.Schema.ObjectId,
        ref: 'WeatherStation'
    },
    datetime: 'Moment',
    utcOffset: Number,
    interval: Number,
    data: {
        minTemp: Number,
        avgTemp: Number,
        maxTemp: Number,
        avgHumi: Number,
        winSpeed: Number,
        SolarRad: Number
    },
    __v: {
        type: Number,
        select: false
    }
});

var WsData = mongoose.model('WsData', wsDataSchema);
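
As a rough check of the 106,000 figure: the BSON document limit is 16 MB, that is 16 × 1024 × 1024 bytes, so that record count corresponds to roughly 158 bytes per record if all of them sat inside one document. A small sketch of the arithmetic, where impliedBytesPerRecord is a hypothetical helper name:

```javascript
// Maximum size of a single BSON document in MongoDB: 16 MB.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: average bytes per record implied by fitting
// `recordCount` records into one document of maximum size.
function impliedBytesPerRecord(recordCount) {
  return MAX_BSON_DOCUMENT_BYTES / recordCount;
}

console.log(Math.floor(impliedBytesPerRecord(106000))); // prints 158
```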

To solve this, I've been looking into how best to implement this kind of database with MongoDB, and I'm reading about GridFS (https://www.mongodb.com/blog/post/building-mongodb-applications-binary-files-using-gridfs-part-1?_ga=1.247329284.1472580275.1437575537).

So, my question is: should I use something like GridFS, or am I doing it all wrong? How can I work with this kind of data that exceeds the 16 MB document size limit?

Sorry about this long and boring text.

Thanks all!

  • I don't understand your problem with the size limit: it applies to a single document, but every measurement from every station would be a separate document within the "WsData" collection. Your model is correct and you could fit billions of measurements. Anyway, GridFS is only for storing large binary files, like "LOB" fields in a relational database: it's useless for you. – dbra Sep 01 '15 at 16:52
  • Man, I realize now that I've probably been making a huge, stupid mistake. The point is that my "WsData" is a collection with many documents (and not a single document, as I was thinking), right? In that case, do you think it would be better to put the "WsData" inside the WeatherStations as a new field instead of using a separate schema? I really appreciate your help. – Cerbaro Sep 01 '15 at 19:30
  • No, don't do it! Otherwise you'll run into the limit you originally pointed out. Furthermore, there's no real way to quickly search/access a single value inside a subdocument. On the other hand, even the dedicated WsData collection would become slow over the years, once the measurements number in the hundreds of millions. There are specific schema designs for your case; one of them is explained in the official webinar "MongoDB Schema Design and Performance Implications", if my memory isn't failing. Anyway, I'm not sure it would fit into mongoose, so go ahead with the above design for now. – dbra Sep 01 '15 at 20:10
  • I got it. So far so good! I've also found the link you mentioned (https://www.mongodb.com/presentations/webinar-mongodb-schema-design-and-performance-implications) and I'll stop to watch the video now. Thank you so much, dbra. I really appreciate it. Cheers! – Cerbaro Sep 01 '15 at 20:59
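
One schema family often used for time series in MongoDB is bucketing: instead of one document per measurement, several measurements for the same station and time window share one document, which reduces the document count while keeping each document well under the size limit. This may or may not be the design covered in the webinar mentioned above. The sketch below is plain JavaScript under stated assumptions: bucketByStationDay and the bucket shape (weatherStation, day, count, readings) are hypothetical, and the window is one UTC day.

```javascript
// Hypothetical helper: group flat readings into one bucket per
// station per UTC day. Each reading is assumed to look like
// { weatherStation, datetime, data }, as in the WsData schema.
function bucketByStationDay(readings) {
  const buckets = new Map();
  for (const r of readings) {
    // "YYYY-MM-DD" part of the ISO timestamp, in UTC.
    const day = new Date(r.datetime).toISOString().slice(0, 10);
    const key = r.weatherStation + '|' + day;
    if (!buckets.has(key)) {
      buckets.set(key, {
        weatherStation: r.weatherStation,
        day: day,
        count: 0,
        readings: []
      });
    }
    const bucket = buckets.get(key);
    bucket.readings.push({ datetime: r.datetime, data: r.data });
    bucket.count += 1;
  }
  return Array.from(buckets.values());
}
```

Each returned object could then be stored as one document. Whether a day is the right window depends on how many readings a station produces per day and on how the data is queried.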
