I have been playing around with MongoDB and want to key items by domain-name. The problem is that using special characters like period '.' for keys breaks Mongo with the error:

Error: key www.google.com must not contain '.'

For example, I want to be able to store:

stupidObject = {
    'www.google.com': {
        '8.8.8.8': 'Other info',
        '8.8.4.4': ['item1', 'item2', ... , 'itemN']
        },
    'www.othersite.com': {
        '8.8.8.8': 'Other info',
        '8.8.4.4': ['item1', 'item2', ... , 'itemN']
        },
    }

All of the solutions I've seen are some variant of: change the key before saving, use Unicode representation, hash the key before saving. e.g. see answer: MongoDB dot (.) in key name

All of these solutions cause their own problems and make it hard to maintain the code. The responsibility is on the programmer to remember to do this filtering and to do it consistently. This is a TERRIBLE solution.

I thought about hashing, but collisions are a risk (and would be near impossible to debug), and it again places the responsibility on the programmers. Imagine the impact these solutions will have on international developer teams.

My question is simple: What is the correct way to do this in MongoDB?

I ended up with a custom solution where I recursively (alarm bells!) navigate the structure and replace the special characters. This is done in the Mongoose Schema by leveraging pre('save') and post('find') hooks.

This means that programmers don't have to care about special characters in the keys they save (such as domain names); the database layer takes care of everything transparently. This seems like a better solution to me.

However... This requires some messy code to get around the issues of a Mongoose object misbehaving when using hasOwnProperty and the requirement to run '.toObject()' first and then pass the original 'this' pointer by reference.

This solution works but I figured there must be a better way! Any thoughts or guidance on the right way to do this is gratefully accepted! When you see the code below you'll realise why I think there must be a better way!

I should mention I didn't want to install any libraries or have other dependencies in order to solve this problem.

Here's an example of the code used:

// Recursive function to replace character||string in keys that may cause violations
// Same code can be used to reverse the change
//
var replaceStringInKeys = function (stringToReplace, newString, regExp, thisObj, thisPtr) {
    for (var property in thisObj) { // 'var' avoids leaking 'property' as an implicit global
        if (thisObj.hasOwnProperty(property)) {
            if(property.indexOf(stringToReplace) > -1) {
                // Replace the '.'s with URL escaped version. Delete old object.
                var newproperty = property.replace(regExp, newString);
                thisObj[newproperty] = thisObj[property];
                thisPtr[newproperty] = thisPtr[property];
                delete thisObj[property];
                delete thisPtr[property];
                // Pass the new property too
                if (thisObj[newproperty] && thisObj[newproperty].constructor === Object) { // guard: null/undefined values have no constructor
                    thisObj[newproperty] = replaceStringInKeys(stringToReplace, newString, regExp, thisObj[newproperty], thisPtr[newproperty]);
                    thisPtr[newproperty] = thisObj[newproperty];
                }
                continue;
            }
            if (thisObj[property] && thisObj[property].constructor === Object) { // guard: null/undefined values have no constructor
                thisObj[property] = replaceStringInKeys(stringToReplace, newString, regExp, thisObj[property], thisPtr[property]);
                thisPtr[property] = thisObj[property];
            }
        }
    }
    return thisObj;
};

testSchema.pre('save', function(next) {
    // Calling '.toObject' allows for hasOwnProperty to work
    var thisObj = this.toObject();
    console.log('Pre save record...');
    // Duplicate the this pointer as mongo is too shit to use hasOwnProperty properly
    replaceStringInKeys('.', '[whateveryouwantinsteadofdot]', /\./g, thisObj, this);
    next();
});

testSchema.post('find', function(results) {
    console.log('post find record...');
    // Undo the changes made by the pre-save hook
    var i;
    for(i = 0; i < results.length; i++) {
        var thisObj = results[i].toObject();
        replaceStringInKeys('[whateveryouwantinsteadofdot]', '.', /\[whateveryouwantinsteadofdot\]/g, thisObj, results[i]);
    }
});

Note: Be careful using this solution (if you're crazy enough to) as there may be security issues. For example, if a bad guy knows you replace '.' with %2E and they can force the use of e.g.

hxxp://www.vulnerablesitethatdoesntexist.com/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/%2E%2E/etc/passwd

which is correctly escaped but would be transparently translated to a directory traversal type string:
hxxp://www.vulnerablesitethatdoesntexist.com/../../../../../../../../../../../etc/passwd
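
The underlying problem can be shown in a few lines: if the replacement token can already occur in the input, the read-side substitution is not a true inverse. The sketch below assumes '%2E' as the token, as in the note above; `escapeKey`, `unescapeKey` and the `safe*` variants are hypothetical helpers. Escaping the escape character itself first is one common way to keep the mapping reversible. Treat it as an illustration rather than a complete defence.

```javascript
// Naive pair: '.' <-> '%2E'
function escapeKey(key) { return key.replace(/\./g, '%2E'); }
function unescapeKey(key) { return key.replace(/%2E/g, '.'); }

// Input that already contains the token and no dots at all
var original = '%2E%2E/etc/passwd';
var roundTripped = unescapeKey(escapeKey(original));
// roundTripped is now '../etc/passwd', which differs from what was stored

// Reversible pair: encode '%' first so a literal '%2E' in the input
// can never be confused with an encoded dot.
function safeEscapeKey(key) {
    return key.replace(/%/g, '%25').replace(/\./g, '%2E');
}
function safeUnescapeKey(key) {
    return key.replace(/%2E/g, '.').replace(/%25/g, '%');
}

var safeRoundTripped = safeUnescapeKey(safeEscapeKey(original));
// safeRoundTripped equals the original string
```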

misterfitzy
  • see this answer http://stackoverflow.com/questions/40542336/mongodb-insert-key-with-dollar/40542724#40542724 – sergiuz Apr 25 '17 at 10:26
  • Hi Sergiu, thanks for the link. I had read that answer before. This solution requires wholesale changing of backend data which is rarely a good idea. You could do this substitution via pre and post hooks as above but the solution would be the same as above. And you run the risk of changing data to something that looks like the character but isn't the character i.e. a homograph. This can lead to issues where developers cannot identify why a query is failing if they look at the data via a terminal that renders Unicode. – misterfitzy Apr 25 '17 at 11:15

1 Answer

You should change your document's structure to avoid using dots in keys. I ran into the same issue years ago.

yourStupidObject = {
  'www.google.com': [
    {'ip': '8.8.8.8', more: 'Other info'},
    {'ip': '8.8.4.4', more: ['item1']}
    ]
}
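
A minimal sketch of how a frontend object could be converted to and from this kind of structure at the database boundary. It applies the same idea to the domain level as well, since 'www.google.com' also contains dots. `toEntries`, `fromEntries` and the field names `domain`, `ips` and `value` are hypothetical and not part of MongoDB or Mongoose.

```javascript
// Turn { key: value, ... } into [{ <keyName>: key, value: value }, ...]
function toEntries(obj, keyName) {
    return Object.keys(obj).map(function (key) {
        var entry = {};
        entry[keyName] = key;
        entry.value = obj[key];
        return entry;
    });
}

// Inverse of toEntries
function fromEntries(entries, keyName) {
    var obj = {};
    entries.forEach(function (entry) {
        obj[entry[keyName]] = entry.value;
    });
    return obj;
}

var frontendObject = {
    'www.google.com': {
        '8.8.8.8': 'Other info',
        '8.8.4.4': ['item1', 'item2']
    }
};

// Shape for storage: dotted strings appear only as values, never as keys
var stored = toEntries(frontendObject, 'domain').map(function (d) {
    return { domain: d.domain, ips: toEntries(d.value, 'ip') };
});

// Shape for the frontend: rebuild the original nested object
var rebuilt = fromEntries(stored.map(function (d) {
    return { domain: d.domain, value: fromEntries(d.ips, 'ip') };
}), 'domain');
```

With the values stored this way, the domain and IP strings can also be queried as ordinary field values rather than as key names.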
Thomas R. Koll
  • Hi Thomas, thanks for your comment. After thinking about your answer it's probably the "right" way to do things. However, part of what I wanted to check was keeping JSON structure i.e. frontend object and MongoDB backend the same. In some cases it would be more convenient to simply save the frontend object if possible. – misterfitzy Apr 25 '17 at 11:47
  • Just think how many hours you already sunk into this and how much work changing the front end would be in comparison – Thomas R. Koll Apr 25 '17 at 14:03
  • You're right but it's working as I need it for me at the moment. I just wondered what the correct way to do this is? For my use case it was worth figuring this out as it makes the end-to-end that bit easier. – misterfitzy Apr 25 '17 at 15:16
  • I'm accepting this as the answer, as arguably the correct way to do this is to simplify/change the schema. There's been no other activity within the week. However, if anyone wants to abstract changes on read and write they can use some variation of the code I posted. – misterfitzy May 09 '17 at 23:52