
I'm trying to work out the best way, in terms of performance and redundancy, to build a large document schema in MongoDB. The final JSON that my REST API returns to the app will likely be structured the same way. Internally the data will not be used in many-to-many relationships, which is why I bound it into a single document; only the id will be used as a reference in other collections. What do you think: is it better to split it up relationally, with multiple collections storing the content inside deliverables and linked by references, or to keep it embedded? (Since NoSQL has no joins, I thought embedding would be faster.)

I'm currently using Mongoose in a Node app. The schema:

const mongoose = require('mongoose');
const Schema = mongoose.Schema;

const projectSchema = new Schema({
name: {
    type: String,
    required: true,
    minlength: 3,
    maxlength: 50
},
companyId: {
    type: mongoose.Types.ObjectId,
    ref: 'companies',
    required: true
},
deleted: {
    type: Number,
    enum: [0, 1],
    default: 0
},
// Dates default to null (Mongoose casts "" to null for Date paths anyway)
predictedStartDate: {
    type: Date,
    default: null
},
predictedEndDate: {
    type: Date,
    default: null
},
realStartDate: {
    type: Date,
    default: null
},
realEndDate: {
    type: Date,
    default: null
},
//not final version
riskRegister: [{
    name: String,
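    wpId: {
        type: mongoose.Types.ObjectId,
        // Note: `ref` expects a model name; Mongoose cannot populate
        // a path into a nested subdocument array like this one.
        ref: 'projects.deliverables.workPackages.id',
        required: true
    },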
    probability: String,
    impact: String,
    riskOwner: String,
    response: String,
    duration: String,
    trigger: String,
    status: String,
    plannedTimming: String
}],
deliverables: [{
    body: String,
    workPackages: [{
        body: String,
        activities: [{
            body: String,
            tasks: [{
               content: String,
               properties: [{
                   dependecies: Array,
                   risk: {
                       type: Number,
                       enum: [0,1],
                       required: true
                   },
                   estimatedTime: {
                       type: Number,
                       required: true
                   },
                   realTime: {
                      required: true,
                      default: 0,
                      type: Number 
                   },
                   responsible: {
                       id: {
                           type: Number,
                           default: -1
                       },
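                       type: {
                           // Number, not String: the enum values and the default are numeric
                           type: Number,
                           enum: [-1, 0, 1], // 0 - user, 1 - team; -1 is assumed to mean unassigned
                           default: -1
                       }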
                   },
                   materialCosts: {
                       type: Number,
                       default: 0
                   },
                   status: {
                       type: Number,
                       default: 0
                   },
                   approval: {
                       type: Number,
                       default: 0
                   },
                   startDate: {
                       type: Date,
                       default: null
                   },
                   finishDate: {
                       type: Date,
                       default: null
                   },
                   endDate: {
                       type: Date,
                       default: null
                   },
                   userStartDate: {
                       type: Date,
                       default: null
                   },
                   endStartDate: {
                       type: Date,
                       default: null
                   },
                   taskNum: {
                       type: Number,
                       required: true
                   },
                   lessonsLearn: {
                       insertedAt: {
                           type: Date,
                           default: Date.now
                       },
                       creatorId: {
                           type: mongoose.Types.ObjectId,
                           ref: 'users',
                           required: true
                       },
                       situation: {
                           type: String,
                           required: true
                       },
                       solution: {
                           type: String,
                           required: true
                       },
                       attachments: Array
                   }
               }] 
            }]
        }]
    }]
}]

})

Enigma

1 Answer


The only concern I would raise is with deliverables. If a future use case needs CRUD operations on the activities or tasks of a workPackage, note that the MongoDB positional operator $ cannot traverse more than one array level, so with it alone you would be forced to fetch all the deliverables, iterate over them in memory, and only then write the updated deliverables back. (MongoDB 3.6 added the filtered positional operator $[<identifier>] with arrayFilters, which can reach nested arrays, but updates at this depth are still cumbersome.) My suggestion would be to keep arrays only at the first level of the object and to model the inner objects (activities and tasks) in separate collections. Recent versions of MongoDB (4.0 and later) support multi-document transactions, so you can get ACID guarantees for operations that span those collections and manipulate all of this information atomically.
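
To make the in-memory iteration concrete, here is a minimal sketch in plain JavaScript (no database involved). The helper name `updateTaskInMemory` and the sample data are made up for illustration; only the nesting follows the schema in the question.

```javascript
// Hypothetical helper: walks every nesting level of a project document
// shaped like the schema in the question and patches the first
// `properties` entry whose taskNum matches.
function updateTaskInMemory(project, taskNum, patch) {
  for (const deliverable of project.deliverables || []) {
    for (const workPackage of deliverable.workPackages || []) {
      for (const activity of workPackage.activities || []) {
        for (const task of activity.tasks || []) {
          for (const props of task.properties || []) {
            if (props.taskNum === taskNum) {
              Object.assign(props, patch);
              return true; // caller still has to persist the change
            }
          }
        }
      }
    }
  }
  return false; // no matching task found
}

// Sample data (made up for illustration).
const project = {
  name: 'Demo',
  deliverables: [{
    body: 'D1',
    workPackages: [{
      body: 'WP1',
      activities: [{
        body: 'A1',
        tasks: [{
          content: 'T1',
          properties: [{ taskNum: 7, status: 0, estimatedTime: 3 }]
        }]
      }]
    }]
  }]
};

const changed = updateTaskInMemory(project, 7, { status: 1 });
console.log(changed);
```

After a call like this the whole deliverables array would still have to be written back to the document, which is the cost described above. With tasks in their own collection, the same change would be a single update on one small document.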

  • Okay, thanks for your opinion. Indeed, I will update tasks frequently. I'm currently using MySQL, but since our stack is heavily JSON-based, Mongo suits that task better (also, some fields may be objects). My current implementation keeps activities, tasks, WPs, etc. separate, but the main problem is the cost of the "joins" – Enigma Jan 17 '19 at 17:40
  • For more complex joins you will have to use MongoDB's aggregation framework: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/. If you consider the answer valid, could you give it an upvote? – Vitor Paulino Jan 17 '19 at 19:11
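
As a rough sketch of what such a join could look like, the snippet below only builds a `$lookup` pipeline as a plain array. The collection name `tasks`, the fields `workPackageId` and `activityId`, and the function name are assumptions for illustration; none of them come from the thread.

```javascript
// Hypothetical layout: an `activities` collection whose documents carry a
// `workPackageId`, and a `tasks` collection whose documents carry an
// `activityId`. These names are assumptions, not part of the original schema.
function buildActivitiesWithTasksPipeline(workPackageId) {
  return [
    // Keep only the activities that belong to the given work package.
    { $match: { workPackageId: workPackageId } },
    // Left outer join: attach each activity's tasks as an array field.
    {
      $lookup: {
        from: 'tasks',
        localField: '_id',
        foreignField: 'activityId',
        as: 'tasks'
      }
    }
  ];
}

// 'wp-1' is a placeholder; a real call would pass an ObjectId.
const pipeline = buildActivitiesWithTasksPipeline('wp-1');
console.log(JSON.stringify(pipeline, null, 2));
```

A pipeline like this would be passed to `aggregate()` on the activities collection (for example, Mongoose's `Model.aggregate(pipeline)`). Each `$lookup` stage adds work on the server, so whether it beats the embedded layout depends on how often the joined view is actually needed.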