I have been playing around with MongoDB and want to key items by domain-name. The problem is that using special characters like period '.' for keys breaks Mongo with the error:
Error: key www.google.com must not contain '.'
For example, I want to be able to store:
stupidObject = {
    'www.google.com': {
        '8.8.8.8': 'Other info',
        '8.8.4.4': ['item1', 'item2', ... , 'itemN']
    },
    'www.othersite.com': {
        '8.8.8.8': 'Other info',
        '8.8.4.4': ['item1', 'item2', ... , 'itemN']
    },
}
All of the solutions I've seen are some variant of: change the key before saving, use Unicode representation, hash the key before saving. e.g. see answer: MongoDB dot (.) in key name
All of these solutions cause their own problems and make it hard to maintain the code. The responsibility is on the programmer to remember to do this filtering and to do it consistently. This is a TERRIBLE solution.
I thought about hashing, but collisions are a risk (and would be nearly impossible to debug), and it again places the responsibility on the programmers. Imagine the impact these solutions will have on international developer teams.
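For concreteness, the "change the key before saving" variant usually looks something like the sketch below. The helper names `encodeKey` and `decodeKey` are hypothetical (they are not part of MongoDB or Mongoose), and the fullwidth full stop U+FF0E is just one substitute that is sometimes suggested; any string that cannot occur in a real key would do.

```javascript
// Hypothetical helpers for illustration; not a MongoDB or Mongoose API.
const FULLWIDTH_DOT = '\uFF0E'; // looks like '.', but is a different character

// Replace every '.' in a key before it is written.
function encodeKey(key) {
  return key.split('.').join(FULLWIDTH_DOT);
}

// Restore the original key after it is read back.
function decodeKey(key) {
  return key.split(FULLWIDTH_DOT).join('.');
}

const stored = {};
stored[encodeKey('www.google.com')] = 'Other info';

const storedKey = Object.keys(stored)[0];
console.log(storedKey === 'www.google.com');            // false
console.log(decodeKey(storedKey) === 'www.google.com'); // true
```

The weakness is visible in the last two lines: every read and write path has to remember to call the helpers, otherwise lookups silently miss.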
My question is simple: What is the correct way to do this in MongoDB?
I ended up with a custom solution where I recursively (alarm bells!) navigate the structure and replace the special characters. This is done in the Mongoose schema by leveraging pre('save') and post('find') hooks.
This means that the programmers don't have to care about special characters in the keys they save (such as domain names); the database layer takes care of everything transparently. This seems like a better solution to me.
However... This requires some messy code to get around the issues of a Mongoose object misbehaving when using hasOwnProperty and the requirement to run '.toObject()' first and then pass the original 'this' pointer by reference.
This solution works but I figured there must be a better way! Any thoughts or guidance on the right way to do this is gratefully accepted! When you see the code below you'll realise why I think there must be a better way!
I should mention I didn't want to install any libraries or have other dependencies in order to solve this problem.
Here's an example of the code used:
// Recursive function to replace a character or string in keys that may cause violations.
// The same code can be used to reverse the change.
//
var replaceStringInKeys = function (stringToReplace, newString, regExp, thisObj, thisPtr) {
    // Declare the loop variable locally; as an implicit global it would be
    // overwritten by the recursive calls below.
    var property;
    for (property in thisObj) {
        if (thisObj.hasOwnProperty(property)) {
            if (property.indexOf(stringToReplace) > -1) {
                // Rename the key on both objects, then delete the old entry.
                var newproperty = property.replace(regExp, newString);
                thisObj[newproperty] = thisObj[property];
                thisPtr[newproperty] = thisPtr[property];
                delete thisObj[property];
                delete thisPtr[property];
                // Recurse into the renamed property if it is a plain object
                // (the truthiness check guards against null/undefined values)
                if (thisObj[newproperty] && thisObj[newproperty].constructor === Object) {
                    thisObj[newproperty] = replaceStringInKeys(stringToReplace, newString, regExp, thisObj[newproperty], thisPtr[newproperty]);
                    thisPtr[newproperty] = thisObj[newproperty];
                }
                continue;
            }
            if (thisObj[property] && thisObj[property].constructor === Object) {
                thisObj[property] = replaceStringInKeys(stringToReplace, newString, regExp, thisObj[property], thisPtr[property]);
                thisPtr[property] = thisObj[property];
            }
        }
    }
    return thisObj;
};
testSchema.pre('save', function(next) {
    // Calling '.toObject' allows for hasOwnProperty to work
    var thisObj = this.toObject();
    console.log('Pre save record...');
    // Also pass the Mongoose document itself ('this'), since hasOwnProperty
    // does not behave as expected on it
    replaceStringInKeys('.', '[whateveryouwantinsteadofdot]', /\./g, thisObj, this);
    next();
});
testSchema.post('find', function(results) {
    console.log('post find record...');
    // Undo the changes made by the pre-save hook
    var i;
    for (i = 0; i < results.length; i++) {
        var thisObj = results[i].toObject();
        replaceStringInKeys('[whateveryouwantinsteadofdot]', '.', /\[whateveryouwantinsteadofdot\]/g, thisObj, results[i]);
    }
});
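A non-mutating variant of the same recursive walk might be easier to reason about. The sketch below is an assumption-laden alternative, not what the hooks above do: `mapKeysDeep` is a hypothetical helper that works only on plain objects and arrays (for example the result of `.toObject()`), and it returns a new structure instead of editing two references in step. The `[dot]` placeholder is arbitrary.

```javascript
// Hypothetical helper: returns a copy of `value` with every object key
// passed through `mapKey`. Only plain objects and arrays are traversed;
// anything else (strings, numbers, dates, null, ...) is returned as-is.
function mapKeysDeep(value, mapKey) {
  if (Array.isArray(value)) {
    return value.map(function (item) {
      return mapKeysDeep(item, mapKey);
    });
  }
  if (value !== null && typeof value === 'object' && value.constructor === Object) {
    var result = {};
    Object.keys(value).forEach(function (key) {
      result[mapKey(key)] = mapKeysDeep(value[key], mapKey);
    });
    return result;
  }
  return value;
}

var placeholder = '[dot]'; // arbitrary stand-in for '.'
var input = {
  'www.google.com': {
    '8.8.8.8': 'Other info',
    '8.8.4.4': ['item1', 'item2']
  }
};

var encoded = mapKeysDeep(input, function (key) {
  return key.split('.').join(placeholder);
});
var decoded = mapKeysDeep(encoded, function (key) {
  return key.split(placeholder).join('.');
});

console.log(Object.keys(encoded)[0]); // www[dot]google[dot]com
console.log(Object.keys(decoded)[0]); // www.google.com
```

Because the input is never modified, there is no need to keep a second pointer in sync; the trade-off is that the result has to be copied back onto the document before saving.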
Note: Be careful using this solution (if you're crazy enough to), as there may be security issues. For example, if an attacker knows you replace '.' with %2E, they can submit a string such as:
hxxp://www.vulnerablesitethatdoesntexist.com/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/etc/passwd
which is correctly escaped but would be transparently translated to a directory traversal type string:
hxxp://www.vulnerablesitethatdoesntexist.com/../../../../../../../../../../../etc/passwd
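One way to avoid that ambiguity is to escape the escape character itself, so that a literal %2E in the input is stored differently from an encoded dot and survives the round trip unchanged. The sketch below assumes percent-style escaping; `encodeKey` and `decodeKey` are hypothetical names, and it only addresses the round-trip problem, not any other key restrictions the database may impose.

```javascript
// Hypothetical helpers for illustration only.
// '%' is escaped as well as '.', so the encoding is reversible for any input.
function encodeKey(key) {
  return key.replace(/[%.]/g, function (ch) {
    return ch === '%' ? '%25' : '%2E';
  });
}

// Decode in a single pass so that "%252E" becomes "%2E", not ".".
function decodeKey(key) {
  return key.replace(/%(25|2E)/g, function (match, code) {
    return code === '25' ? '%' : '.';
  });
}

var hostile = '%2E%2E/%2E%2E/etc/passwd';
var storedKey = encodeKey(hostile);

console.log(storedKey);                        // %252E%252E/%252E%252E/etc/passwd
console.log(decodeKey(storedKey) === hostile); // true
console.log(encodeKey('www.google.com'));      // www%2Egoogle%2Ecom
```

With this scheme the hostile string comes back exactly as it went in, still escaped, rather than being turned into a run of "../" segments.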