I want to code a little app where I can store the incoming request url, named reqUrl below, and check if it already exists by using the compareUrls function.

It returns true if both URLs are in the same domain and false otherwise. For example, compareUrls("stackoverflow.com", "http://www.stackoverflow.com") returns true. I use it to avoid adding duplicate URLs.
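
The question does not show compareUrls itself, so the following is only a minimal sketch of what such a function might look like, assuming "same domain" means the same host name once the scheme, a leading "www." and any port or path are ignored. The helper name normalizeDomain is hypothetical, and inputs with embedded credentials (user:pass@host) are not handled.

```javascript
// Hypothetical helper: reduce a URL-like string to a bare host name.
function normalizeDomain(input) {
  return String(input)
    .trim()
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, "") // drop a scheme such as http://
    .replace(/^www\./, "")                  // treat www.example.com like example.com
    .replace(/[\/?#:].*$/, "");             // drop port, path, query and fragment
}

// Assumed behaviour of compareUrls: true when both inputs share a host name.
function compareUrls(a, b) {
  return normalizeDomain(a) === normalizeDomain(b);
}
```

With this sketch, compareUrls("stackoverflow.com", "http://www.stackoverflow.com") evaluates to true, matching the example above.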

I am trying to use that function inside a MongoDB query like this:

app.get("/:reqUrl", function(req, res)
{
    var reqUrl = req.params.reqUrl;

    MongoClient.connect(Url, function(err, db)
     { 
       if (err) throw err;  
       db.collection("mydb").find({$where: function() {

         if (compareUrls(reqUrl, this.url)) //if true, simply return the url
         {
            return this.url;
         } else { //if not existing insert it into the database
            db.collection("mydb").insert({"url":reqUrl});
         };          

     }}).toArray();

//Code continues below

Now the problem is that, because of scoping, the reqUrl variable is not recognized inside the $where function, and I don't know of any workaround. Even when I use local variables with compareUrls, I get back the whole collection. I thought about retrieving all results into an array by simply calling .find and checking reqUrl against each item, but that would be far from efficient.

Please note that I am very new to MongoDB.

Any feedback would be appreciated, thanks.

Neil Lunn

1 Answer

The bottom line here is that you cannot perform other database operations inside the logic of a $where clause. Nor should you, since it is completely unnecessary: what you are trying to do is already supported by the standard operators and methods.

What you really want here is .findOneAndUpdate(). You do not need $where for the sort of match condition you are doing, which is simply checking a value. The "query" portion that selects the document is really just a $regex search condition.

As for the "insert" part, that is what "upserts" are for. When the data is not "found", the "upsert" creates/inserts a new document in the collection; when it is found, it "updates". You can tune that in this case with the $setOnInsert modifier, so that a "found" document is not actually modified and data is only written on "insertion":

db.collection("mydb").findOneAndUpdate(
  { "url": new RegExp(reqUrl) },
  { "$setOnInsert": { "url": reqUrl } },
  { "upsert": true, "returnOriginal": false },
  function(err, doc) {
    // deal with result here
  }
)

Of course the $regex usage here is just a basic "is this string present in the property's string" condition. Note that characters such as "." in reqUrl are treated as regular expression metacharacters unless they are escaped. There are more advanced regular expressions specific to "domain matching", such as those in the existing answers here: Regex to match simple domain

But the basic logic remains the same that a "regular expression" does the match condition and then you simply "upsert".
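
As a rough illustration of a stricter match condition, the sketch below escapes the input before building the pattern and anchors it to the host part of the stored URL. Both escapeRegExp and domainPattern are hypothetical helper names, and the pattern assumes the input is already a bare host name such as stackoverflow.com.

```javascript
// Escape regular-expression metacharacters so the input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a pattern that matches stored URLs on the given
// host, allowing an optional scheme and an optional "www." prefix.
function domainPattern(domain) {
  return new RegExp(
    "^(?:[a-z][a-z0-9+.-]*://)?(?:www\\.)?" +
      escapeRegExp(domain) +
      "(?:[:/?#]|$)", // host must end here: port, path, query, fragment or end of string
    "i"
  );
}
```

The resulting RegExp can then be used as the query value, for example { "url": domainPattern("stackoverflow.com") }, in place of new RegExp(reqUrl). Unlike the unescaped version, it does not match a string such as "stackoverflowXcom" or a URL that merely contains the domain in its path.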

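That said, nothing actually stops you from using a $where clause for the match condition, as long as the operation itself remains an "upsert" rather than an attempt to call a database method "within" the supplied function. Keep in mind that the $where code is evaluated on the server, so it cannot see variables from your Node.js scope such as reqUrl. The value has to be embedded in the expression, and compareUrls must either be stored as a server function or be written inline:

db.collection("mydb").findOneAndUpdate(
  // Evaluated on the server: reqUrl is embedded as a string literal.
  // Assumes compareUrls is available as a stored server function.
  { "$where": "compareUrls(" + JSON.stringify(reqUrl) + ", this.url)" },
  { "$setOnInsert": { "url": reqUrl } },
  { "upsert": true, "returnOriginal": false },
  function(err, doc) {
    // deal with result here
  }
)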

Just make sure that the $where expression, or the server function it calls, actually returns a boolean true/false, since that is how $where decides whether a document matches.

Also note the usage of "returnOriginal": false here, as the default behavior of the .findOneAndUpdate() method is to return the "original" document before modification. In some cases this is what you want, but the most common usage is to return the document in its modified state. In more recent versions of the Node.js driver this option is spelled "returnDocument": "after" instead.

Of course if you do not need the document in response at all, then .updateOne() will suffice as a method, and reduces the overhead of returning the document content "over the wire".
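
A minimal sketch of that .updateOne() variant is shown below, assuming the promise form of the driver API. The function name addUrlIfMissing is hypothetical, `collection` is assumed to be a driver Collection such as db.collection("mydb"), and `filter` is whatever match condition you settled on (for instance a regular expression on "url").

```javascript
// Sketch: insert reqUrl only when no stored document matches the filter.
// Resolves to true when a new document was inserted, false when one existed.
async function addUrlIfMissing(collection, filter, reqUrl) {
  const result = await collection.updateOne(
    filter,
    { $setOnInsert: { url: reqUrl } }, // only written when the upsert inserts
    { upsert: true }
  );
  return result.upsertedCount === 1;
}
```

Because only the write result comes back, the caller learns whether the URL was new without the matched document being transferred.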

Neil Lunn
  • Thanks a lot for your detailed answer. I thought about going the regex route before, but I absolutely wanted to use the `compareUrls` function. Too bad that it is difficult to use custom functions inside MongoDB queries –  Jun 06 '17 at 02:03
  • @Valilutzik It's very much by design. And not "too bad" at all. I cannot imagine there is anything happening in your function that cannot be done in a regular expression. In fact, to be "really optimal" you probably should be storing the "domain" only as a property if that is your test of uniqueness; then this becomes a simple "equality" match and the most performant option. But what you apparently really **need** to wrap your head around is that **JavaScript Evaluation === BAD** in terms of performance and a whole host of other reasons. Use the native operators. – Neil Lunn Jun 06 '17 at 02:10
  • Okay, got it :) –  Jun 06 '17 at 02:14
  • @Valilutzik Ran off to lunch in the middle of that, but what I also meant to say is that there is nothing actually stopping you using a `$where` clause for the "selection" logic, it's just that you probably should not if you can avoid it. The real issue is that your presumption on using `.insert()` as a database method here is incorrect. Example added in the answer with explanation. Which should be useful information you do not seem to know. – Neil Lunn Jun 06 '17 at 03:22
  • I actually kept thinking about your previous comment about design and performance and things are starting to make sense now. I'll reflect on your updated answer. Thanks again a ton for your help –  Jun 06 '17 at 03:38
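
The suggestion in the comments to store the domain as its own property could be sketched roughly as follows. The names hostOf and domainUpsert and the field name "domain" are hypothetical, and the normalization (lowercased host name without a leading "www.") is just one possible choice.

```javascript
// Hypothetical helper: extract a normalized host name using the built-in URL class.
function hostOf(input) {
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const parsed = new URL(hasScheme ? input : "http://" + input);
  return parsed.hostname.replace(/^www\./, "");
}

// Build the arguments for an upsert keyed on a plain equality match.
function domainUpsert(reqUrl) {
  const domain = hostOf(reqUrl);
  return {
    filter: { domain: domain },                // equality match, no regex or $where
    update: { $setOnInsert: { url: reqUrl } }, // on insert, "domain" is taken from the filter
    options: { upsert: true }
  };
}
```

The three parts would be passed to .updateOne() or .findOneAndUpdate() in that order. An index on "domain" keeps the equality match cheap.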