
I'm trying to compare all documents in two collections. The comparison should return true if and only if all documents in the two collections are exactly equal.

I've been searching the methods available on a collection, but couldn't find one that does this.

I tried things like the following in the mongo shell, but they don't work as I expected:

db.test1 == db.test2

or

db.test1.to_json() == db.test2.to_json()

Please share your thoughts! Thank you.

Captain Levi
Bertie
    Running `db.runCommand('dbHash')` returns hashes for your database and its collections, so you could compare one collection's hash with another's. That might be an easier way to tell whether both collections are the same. – KhoPhi Feb 05 '16 at 12:45
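A rough sketch of the hash idea from the comment above. The helper name `sameCollectionHash` is made up for illustration, and it assumes the `dbHash` result has a `collections` field mapping each collection name to its hash. I have not verified exactly what the hash covers, so treat a match as a strong hint rather than proof, and a mismatch as a reason to look closer.

```javascript
// Hypothetical helper (not a MongoDB API): compare the per-collection
// hashes reported by the dbHash command.
// `database` is the shell's `db` object, or anything with a compatible runCommand.
function sameCollectionHash(database, nameA, nameB) {
    // Ask only for the two collections we care about.
    var result = database.runCommand({ dbHash: 1, collections: [nameA, nameB] });
    if (!result.ok) {
        throw new Error("dbHash failed");
    }
    var hashes = result.collections;
    // Both hashes must be present and identical.
    return hashes[nameA] !== undefined && hashes[nameA] === hashes[nameB];
}

// In the mongo shell this would be called as:
//   sameCollectionHash(db, "test1", "test2")
```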

1 Answer


You can try MongoDB's db.eval combined with a custom equals function.

Your approaches don't work because in the first case you are comparing object references, which are not the same. In the second case, there is no guarantee that to_json will generate the same string, even for objects that are the same.
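To make both failure modes concrete, here is a small plain-JavaScript illustration. It uses `JSON.stringify` as a stand-in for whatever serialization you call, and it assumes you consider two documents with the same fields in a different order to be equal.

```javascript
// Two distinct objects with identical contents.
const first = { a: 1, b: 2 };
const second = { a: 1, b: 2 };

// == and === compare references for objects, so this is false
// even though the contents match.
const sameReference = first === second;

// Same fields and values, but inserted in a different order:
// the serialized strings differ.
const reordered = { b: 2, a: 1 };
const sameString = JSON.stringify(first) === JSON.stringify(reordered);

console.log(sameReference); // false
console.log(sameString);    // false
```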

Instead, try something like this:

var compareCollections = function(){
    // Define the comparison once, outside the loops.
    var equals = function(o1, o2){
        // here goes some compare code...modified from the SO link you have in the answer.
    };

    db.test1.find().forEach(function(obj1){
        db.test2.find({/*if you know some properties, you can put them here...if you don't, leave this empty*/}).forEach(function(obj2){
            if(equals(obj1, obj2)){
                // Do what you want to do
            }
        });
    });
};

db.eval(compareCollections);

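With db.eval you ensure that the code is executed on the database server side, without fetching the collections to the client. Note that db.eval was deprecated in MongoDB 3.0 and removed in 4.2, so this only applies to older servers.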
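For the placeholder `equals` above, one possible implementation is a recursive comparison that ignores field order. This is only a sketch: the name `deepEqual` is my own, it handles plain values, arrays, nested objects and dates, and it does nothing special for BSON wrapper types such as ObjectId, which may need extra handling in your shell.

```javascript
// Hypothetical helper: recursive comparison of two values,
// insensitive to the order of object fields.
function deepEqual(a, b) {
    // Identical primitives or the very same object.
    if (a === b) return true;

    // Dates are equal when they represent the same instant.
    if (a instanceof Date || b instanceof Date) {
        return a instanceof Date && b instanceof Date && a.getTime() === b.getTime();
    }

    // Anything left that is not a non-null object is an unequal primitive.
    if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
        return false;
    }

    // An array never equals a non-array object.
    if (Array.isArray(a) !== Array.isArray(b)) return false;

    // Same set of keys, and every value equal recursively.
    var keysA = Object.keys(a);
    var keysB = Object.keys(b);
    if (keysA.length !== keysB.length) return false;
    return keysA.every(function (key) {
        return Object.prototype.hasOwnProperty.call(b, key) && deepEqual(a[key], b[key]);
    });
}
```

Because keys are compared as a set, `{a: 1, b: 2}` and `{b: 2, a: 1}` count as equal here. If field order matters to you, compare the key arrays position by position instead.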

Aleksandar Vucetic
  • Thanks for the idea. If I understand correctly, this has two loops, so each document in test1 is tested against every document in test2? Or did you mean that we pass obj1's id as the argument to test2.find, because in my case whatever is in test1 must be in test2 with the same id? I'm also confused about what happens if test2 has more documents than test1, or test1 has more than test2, which in my case would mean that test1 and test2 are not equal. Any thoughts on detecting this without looping over both collections? Thanks! – Bertie Feb 11 '12 at 18:11
  • This code goes through both collections and does something when it finds a match for a document from the first collection in the second collection (or you can do something when a match is not found, just put if(!equals(...))). If you just want to check whether both collections are equal, this can be optimized a lot. For example, before calling db.test1.find you can compare the counts of both collections, like db.test1.find().count() == db.test2.find().count(), and if the counts are not equal, there is no reason to continue. Also, as I pointed out in the code, if there is some property you know about (like _id) you (continued...) – Aleksandar Vucetic Feb 11 '12 at 19:00
  • can put it inside db.test2.find({ ... here ... }) to speed up the lookup of the second object. So, if your counts are equal and you never go into if(!equals(...)), then your collections are equal. The important thing is that, at the end, you use db.eval to make sure that your code is executed directly on the server; otherwise, you will end up fetching both collections to the client, which can slow things down a lot. – Aleksandar Vucetic Feb 11 '12 at 19:01
  • I think I got the idea! I'll just compare the lengths first, as you said. I can still do an object-to-object comparison by finding the match in the second collection and comparing it using something like the solution from here: http://stackoverflow.com/questions/1068834/object-comparison-in-javascript. Thanks a ton! – Bertie Feb 12 '12 at 01:15
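Putting the comments together, the count check plus a lookup by _id could be sketched roughly like this. The helper name `collectionsEqual` is hypothetical, `equals` is whatever document comparison function you settle on, and the code assumes shell-style collections with `count()`, `find().forEach()` and `findOne()` (newer shells may prefer `countDocuments()` over `count()`).

```javascript
// Hypothetical helper: true only if both collections hold the same documents,
// matched by _id. `equals` is a caller-supplied document comparison function.
function collectionsEqual(colA, colB, equals) {
    // Cheap check first: different counts mean the collections cannot be equal.
    if (colA.count() !== colB.count()) {
        return false;
    }

    var allMatch = true;
    colA.find().forEach(function (docA) {
        // forEach cannot break early, so just skip the work after a mismatch.
        if (!allMatch) return;
        // Look up the counterpart by _id instead of scanning all of colB.
        var docB = colB.findOne({ _id: docA._id });
        if (!docB || !equals(docA, docB)) {
            allMatch = false;
        }
    });
    return allMatch;
}

// In the mongo shell this would be called as:
//   collectionsEqual(db.test1, db.test2, equals)
```

Since _id values are unique within a collection, equal counts plus a match for every document in the first collection means the second collection cannot contain anything extra, so one loop is enough.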