What are the recommended best practices for a MongoDB / Mongoose schema that stores large amounts of user data securely?
Each User model needs the usual fields (name, email, etc.), and each user could have a large number of associated Content records. (Think of a note-taking app like Evernote.) Each Content document needs the usual metadata, such as the dates it was created and updated, as well as text content (subject, body) and perhaps binary attachments. For the sake of this question, assume attachments are stored outside the database, so we only need to store a file locator. The body text could be quite large.
Ideally, each User's Content is stored in a separate location from any other user's data. I want to avoid any possibility that a simple mistake in a database query could expose one user's content to another. Also, when a user asks to have their data deleted, we need to make sure all of their data is removed, and only their data. Furthermore, someday the data will need to be sharded by user. I think this means each user's content needs to be in a separate Document or Collection.
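To make the isolation goal concrete, here is a minimal sketch of the kind of guard I have in mind. It is plain JavaScript with no database calls; `scopeToOwner` and the `owner` field are hypothetical names that neither schema below defines. The idea is only that no Content filter can be built without a user id attached.

```javascript
'use strict';

// Hypothetical helper: every Content query filter is built through this
// function, so a filter can never be created without an owner.
// The `owner` field name is an assumption made for this sketch.
function scopeToOwner(ownerId, filter) {
  if (ownerId === undefined || ownerId === null || ownerId === '') {
    throw new Error('ownerId is required for every Content query');
  }
  // Copy the caller's filter first, then set `owner` last so that
  // the caller cannot override it.
  return Object.assign({}, filter, { owner: ownerId });
}

// Filter for reading one user's notes by subject.
var readFilter = scopeToOwner('user-1', { subject: 'Groceries' });

// Filter for deleting everything that belongs to one user and nothing else.
var deleteFilter = scopeToOwner('user-1', {});

console.log(readFilter);
console.log(deleteFilter);
```

If Content had such an `owner` field, these objects could presumably be passed to Mongoose calls such as `Content.find(readFilter)` or `Content.deleteMany(deleteFilter)`. A guard like this only reduces the chance of a mistake; it does not physically separate the data.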
Option 1, by reference, doesn't achieve these goals, yet it is the most common example I've seen:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var UserSchema = new Schema({
  email: { type: String, required: true },
  name: { given: String, family: String },
  // References to this user's Content documents
  content: [{ type: Schema.Types.ObjectId, ref: 'Content' }],
  ...
});

var ContentSchema = new Schema({
  // _id is left as the default ObjectId so it matches the ref type above
  subject: String,
  encryptedBody: String,
  ...
});

// Export both models; assigning module.exports twice would keep only the last one
module.exports = {
  User: mongoose.model('User', UserSchema),
  Content: mongoose.model('Content', ContentSchema)
};
Option 2, embed the content in the User schema:
var mongoose = require('mongoose');

var UserSchema = new mongoose.Schema({
  email: { type: String, required: true },
  name: { given: String, family: String },
  ...
  content: [{
    subject: String,
    encryptedBody: String,
    ...
  }]
});

module.exports = mongoose.model('User', UserSchema);
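One constraint I am aware of with Option 2 is that MongoDB caps a single BSON document at 16 MB, so embedding puts a ceiling on how much content one user can hold. A rough back-of-the-envelope sketch follows; the average note sizes are made-up figures for illustration, not measurements, and `maxEmbeddedNotes` is a hypothetical helper.

```javascript
'use strict';

// MongoDB's documented maximum BSON document size: 16 MB.
var MAX_BSON_BYTES = 16 * 1024 * 1024;

// Rough upper bound on how many embedded notes fit in one User document.
// Ignores BSON field overhead and the user's own fields, so the real
// limit would be somewhat lower.
function maxEmbeddedNotes(avgNoteBytes) {
  if (!(avgNoteBytes > 0)) {
    throw new RangeError('avgNoteBytes must be a positive number');
  }
  return Math.floor(MAX_BSON_BYTES / avgNoteBytes);
}

// Assumed average note sizes (illustrative only): 2 KB, 50 KB, 1 MB.
[2 * 1024, 50 * 1024, 1024 * 1024].forEach(function (bytes) {
  console.log(bytes + ' bytes per note: at most ' + maxEmbeddedNotes(bytes) + ' notes');
});
```

With large bodies the ceiling is reached after only a few hundred notes or fewer, which seems to matter for an app where body text could be quite large.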
Is there another way?
Is one better than the other regarding performance? (Assume reads far outnumber writes.)
Any thoughts about indexing the Content Subject field?
Thanks in advance for your thoughts.