Remove field found in any mongodb array

Question

I have a document looking like this:

{
field: 'value',
field2: 'value',
scan: [
    [
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
    ],
    [
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
    ],
    [
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
        {
            dontDeleteMe: 'keepMe',
            arrayToDelete: [0,1,2]
        },
    ],

]

}

We only want to delete any instance of the arrayToDelete that is in a dictionary nested in a list of lists in "scan".

Currently, I've been using

 update({}, {$unset: {"scans.0.0.arrayToDelete":1}}, {multi: true})

to delete the array. However, this only deletes the first (0.0) as you would imagine.

Is it possible to iterate over the scan array and the nested array to delete "arrayToDelete", and keep all other fields?

Thanks

That's not a valid data structure. You cannot have index values presented like that directly within the `[]` elements. I suggest you use the `mongo` shell to retrieve and represent what your data actually looks like. I'm just going to presume it's actually like `{ scan: [[{ arrayToDelete: [0,1,2] },{ arrayToDelete: [0,1,2] }],[{ arrayToDelete: [0,1,2] },{ arrayToDelete: [0,1,2] },],...] }` But you really should correct it here to how it actually looks. — Neil Lunn, Apr 01 '19 at 07:09
So is there anything else actually in the array that needs to stay there? Or can `"scan"` basically be set to `[]`? Because that would be the shortest path? If not that, then do you mean **remove the array element** that contains the `arrayToDelete`? Or do you just want to remove that named field from within the object inside the array, and inside the other array? That could be made clearer. — Neil Lunn, Apr 01 '19 at 07:34
@NeilLunn I've updated the q. to make it more clear. We only want to delete any instance of the arrayToDelete that is in a dictionary nested in a list of lists in "scan". — Dave, Apr 01 '19 at 08:34
@Priyank That didn't seem to work - using the code in the question it will only delete the first dictionary in the first list, for each document in the db. — Dave, Apr 01 '19 at 08:37
@Dave I kind of could not wait and included something similar to your edit in my answer. Also answered a couple of different cases there, so your actual question appears to be **Case 2**. But now at least you have the others for reference of what to do for those as well. — Neil Lunn, Apr 01 '19 at 08:44

Neil Lunn · Accepted Answer · 2019-04-02T09:37:49.680

So I asked a question in the comments but you seem to have walked away, so I guess I just answer the three possible cases I see.

To begin with, I'm not certain if the elements shown within the nested arrays are the only elements within the array or in fact if arrayToDelete is the only field present in those elements. So basically I need to abstract a little and include that case:

{
    field: 'value',
    field2: 'value',
    scan: [
        [
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {   somethingToKeep: 1 },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
        ],
        [
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {   somethingToKeep: 1 },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
        ],
        [
            {   somethingToKeep: 1 },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
            {
                arrayToDelete: [0,1,2],
                anotherField: "a"
            },
        ],
    ]
}

Case 1 - Remove the inner array elements where the Field is present

This would use the $pull operator since that is what removes array elements entirely. You do this in modern MongoDB with a statement like this:

db.collection.updateMany(
  { "scan": {
    "$elemMatch": {
      "$elemMatch": {
        "arrayToDelete": { "$exists": true }
      }
    }
  } },
  {
    "$pull": {
      "scan.$[a]": { "arrayToDelete": { "$exists": true } }
    }
  },
  { "arrayFilters": [
      {  "a": { "$elemMatch": { "arrayToDelete": { "$exists": true } } } }
    ]
  }
)

That alters all matching documents like this:

{
        "_id" : ObjectId("5ca1c36d9e31550a618011e2"),
        "field" : "value",
        "field2" : "value",
        "scan" : [
                [
                        {
                                "somethingToKeep" : 1
                        }
                ],
                [
                        {
                                "somethingToKeep" : 1
                        }
                ],
                [
                        {
                                "somethingToKeep" : 1
                        }
                ]
        ]
}

So every element that contained that field is now removed.

Case 2 - Just remove the matched field from the inner elements

This is where you use $unset. It's just a little different to the "hard indexed" version you were doing:

db.collection.updateMany(
  { "scan": {
    "$elemMatch": {
      "$elemMatch": {
        "arrayToDelete": { "$exists": true }
      }
    }
  } },
  { "$unset": { "scan.$[].$[].arrayToDelete": ""  } }
)

Which alters all matched documents to be:

{
        "_id" : ObjectId("5ca1c4c49e31550a618011e3"),
        "field" : "value",
        "field2" : "value",
        "scan" : [
                [
                        {
                                "anotherField" : "a"
                        },
                        {
                                "somethingToKeep" : 1
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        }
                ],
                [
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "somethingToKeep" : 1
                        },
                        {
                                "anotherField" : "a"
                        }
                ],
                [
                        {
                                "somethingToKeep" : 1
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        },
                        {
                                "anotherField" : "a"
                        }
                ]
        ]
}

So everything is still there, but just the identified fields have been removed from each inner array document.

Case 3 - You Actually wanted to remove "Everything" in the array.

Which is really just a simple case of using $set and wiping everything that was there before:

db.collection.updateMany(
  { "scan": {
    "$elemMatch": {
      "$elemMatch": {
        "arrayToDelete": { "$exists": true }
      }
    }
  } },
  { "$set": { "scan": []  } }
)

Where the results should pretty well be expected:

{
        "_id" : ObjectId("5ca1c5c59e31550a618011e4"),
        "field" : "value",
        "field2" : "value",
        "scan" : [ ]
}

So what are these all doing?

The very first thing you should see is the query predicate. This is generally a good idea to make sure you are not matching and even "attempting" to have update conditions met on documents which do not even contain data with the pattern you are intending to update. Nested arrays are difficult at best, and where practical you really should avoid them, as what you often "really mean" is actually represented in a singular array with additional attributes representing what you "think" the nesting is actually doing for you.

But just because they are hard does not mean impossible. It's just that you need to understand $elemMatch:

db.colelction.find(
  { "scan": {
    "$elemMatch": {
      "$elemMatch": {
        "arrayToDelete": { "$exists": true }
      }
    }
  }}
)

That's the basic find() example, which matches based on the $elemMatch condition for the outer array uses another $elemMatch in order to match another condition in the inner array. Even though this "appears" to be a singular predicate. Something like:

"scan.arrayToDelete": { "$exists": true }

Just will not work. Neither would:

"scan..arrayToDelete": { "$exists": true }

With the "double dot" .. because that's basically just not valid.

That's the query predicate to match "documents" that need to be processed, but the rest applies to actually determine *what parts to update".

In the Case 1 in order to $pull from the inner array, we first need to be able to identify which elements of the outer array contain the data to update. That's what the "scan.$[a]" thing is doing using the positional filtered $[<identifier>] operator.

That operator basically transposes the matched indices ( so many of them ) in the array to another predicate which is defined in the third section of the update style commands with the arrayFilters section. This section basically defines the conditions to be met from the perspective of the named identifier.

In this case out "identifier" is named a, and that is the prefix used in the arrayFilters entry:

  { "arrayFilters": [
      {  "a": { "$elemMatch": { "arrayToDelete": { "$exists": true } } } }
    ]
  }

Taken in context with the actual update statement part:

  {
    "$pull": {
      "scan.$[a]": { "arrayToDelete": { "$exists": true } }
    }
  },

Then from the perspective of the "a" being the identifier for the outer array element first inward from "scan", then same conditions apply as for the original query predicate but from "within" the first $elemMatch statement. So you can basically think of this as a "query within a query" from the perspective of already "looking inside" the content of each outer element.

By the same token the $pull acts much like a "query within a query" in that it's own arguments are also applied from the perspective of the element of the array. Therefore just the arrayToDelete field existing instead of:

  // This would be wrong! and do nothing :(
  {
    "$pull": {
      "scan.$[a]": { "$elemMatch": { "arrayToDelete": { "$exists": true } } }
    }
  }

But that's all specific to $pull, and other things have different cases:

The Case 2 looks at where you want to just $unset the named field. Seems pretty easy as you just name the field, right? Well not exactly since the following is clearly not right from what we know earlier:

  { "$unset": { "scan.arrayToDelete": ""  } } // Not right :(

And of course noting array indexes for everything is just a pain:

  { "$unset": { 
    "scan.0.0.arrayToDelete": "",
    "scan.0.1.arrayToDelete": "",
    "scan.0.2.arrayToDelete": "",
    "scan.0.3.arrayToDelete": "",  // My fingers are tired :-<
  } }

This is the reason for the positional all $[] operator. This one is a little more "brute force" than the positional filtered $[<identifier>] in that instead of matching another predicate provided within arrayFilters, what this simply does is apply to everything within the array contents at that "index". It's basically a way of in fact saying "all indexes" without typing every single one out like the horrible case shown above.

So it's not for all cases, but it's certainly well suited to an $unset since that has a very specific path naming which does not matter of course if that path does not match every single element of the array.

You could still use an arrayFilters and a positional filtered $[<identifier>], but here it would be overkill. Plus it does not hurt to demonstrate the other approach.

But of course it probably is worth understanding how exactly that statement would look, so:

db.collection.updateMany(
  { "scan": {
    "$elemMatch": {
      "$elemMatch": {
        "arrayToDelete": { "$exists": true }
      }
    }
  } },
  { "$unset": { "scan.$[a].$[b].arrayToDelete": ""  } },
  {
    "arrayFilters": [
      { "a": { "$elemMatch": { "arrayToDelete": { "$exists": true } } } },
      { "b.arrayToDelete": { "$exists": true } },
    ]
  }
)

Noting there that the "b.arrayToDelete" may not be what you expect at first, but given the positioning in "scan.$[a].$[b] it really should make sense as from the b the element name would be reached via "dot notation" just as shown. And in fact in both cases. Yet again, an $unset would only apply to the named field anyway, so the selection criteria really is not required.

And Case 3. Well it's quite simple in that if you don't need to keep anything else in the array after removing this content ( ie a $pull where fields matching this were the only things in there, or an $unset for that matter ), then simply don't mess around with anything else and just wipe the array.

This is an important distinction if you consider that as per the point to clarify whether the documents with the named field where the only elements within the nested arrays, and indeed that the named key was the only thing present in the documents.

With the reasoning being that using $pull as shown here and under those conditions you would get:

{
        "_id" : ObjectId("5ca321909e31550a618011e6"),
        "field" : "value",
        "field2" : "value",
        "scan" : [
            [ ],
            [ ],
            [ ]
        ]
}

Or with the $unset:

{
        "_id" : ObjectId("5ca322bc9e31550a618011e7"),
        "field" : "value",
        "field2" : "value",
        "scan" : [
            [{ }, { }, { }, { }],
            [{ }, { }, { }, { }],
            [{ }, { }, { }, { }]
        ]
}

Both of which are clearly not desirable. So it stands yo reason if the arrayToDelete field was the only content that was in there at all, then the most logical way to remove all is simply to replace the array with an empty one. Or indeed $unset the whole document property.

Note however that all these "fancy things" ( with the exception of the $set of course ) require that you must have MongoDB 3.6 at least available in order to use this functionality.

In the event you are still running an older version MongoDB than that ( and as of the date of writing, you really should not be since your official support runs out in just 5 months from this date ) then other existing answers on How to Update Multiple Array Elements in mongodb are actually for you.

Great answer! Thanks Neil, this was super informative. Really appreciate you going through $elemMatch and $pull in detail. Case two is what I was looking for, and the other cases were helpful to know what I'm doing. Many thanks! — Dave, Apr 01 '19 at 09:09
@Dave That's okay as `$elemMatch` in particular can be a bit difficult at first to understand where and where not to use it. In your case here you *"technically"* would not *need* it given how `$unset` applies. But as I said in the content, it would be the difference between processing *every document* instead of just those which actually met the same conditions of containing the fields you actually want to update. So *query predicates* are a good thing for almost every case. As an exercise you might want to look at the exact `arrayFilters` to apply in order to only touch elements that match. — Neil Lunn, Apr 01 '19 at 09:17

score 0 · Answer 2 · answered Apr 01 '19 at 08:42

0

I have tried with your given sample document and its working , you have to use $[] to achieve this :

db.collectionName.(update({},{$unset: {
   "scan.$[].$[].arrayToDelete": 1
}})

answered Apr 01 '19 at 08:42

Vikash_Singh

1,856
2
14
27

Remove field found in any mongodb array

2 Answers2

Case 1 - Remove the inner array elements where the Field is present

Case 2 - Just remove the matched field from the inner elements

Case 3 - You Actually wanted to remove "Everything" in the array.

So what are these all doing?

Linked

Related