0

Following the guidance here, I am using Mongo 4.4 to try and find all documents where field_A is null and fill field_A with the value of field_B.

I tried:

db.collection.updateMany({field_A:{$nin:[null,""]}},[{"$set": {field_A : "$field_B" }}])

This returns:

{ "acknowledged" : true, "matchedCount" : 176394, "modifiedCount" : 9952 }

I don't understand why most of them aren't modifying . . .?

Despite the warnings against it, I also tried running this without the square brackets in the second half of the query:

db.collection.updateMany({field_A:{$nin:[null,""]}},{"$set": {field_A : "$field_B" }})

But that just replaces field_A with the literal phrase "$field_B".

Can anyone explain what I'm doing wrong? Thanks!

UPDATE: Per @prasad_'s suggestion, I tried:

db.collection.updateMany({field_A:null},[{"$set": {field_A : "$field_B" }}])

This worked fine. All matched documents updated.

However, I still have documents where field_A is not truly null but rather "". I tried:

db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])

And I got:

{ "acknowledged" : true, "matchedCount" : 2815, "modifiedCount" : 0 }

I just can't figure out why the "" won't update to the value of field_B.

Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110
jbat
  • 33
  • 6
  • See https://stackoverflow.com/questions/68255363/or-with-if-and-in-mongodb/68255564#68255564 – Wernfried Domscheit Dec 06 '22 at 20:40
  • Two questions. 1) If you're trying to find all documents where `field_A` is `null`, then why are you filtering using `$nin`? 2) When a document matches but is not modified, that suggests that the `update` would not result in any changes to the document. Do most of the documents that match the query predicate already have the same values for `field_A` and `field_B`? One additional thing - the meaning of square brackets is something I outline [here](https://stackoverflow.com/questions/74078046/subdocument-merge-issue-when-mongodb-rust-driver-is-used/74080218#74080218) – user20042973 Dec 06 '22 at 21:08
  • @user20042973 1) I found that just searching only for null in field_A somehow didn't get all of the documents where field_A was actually null. So maybe they have blank spaces in them or something rather than being truly nullI(?). 2) No, by definition I'm just trying to find the documents where field_A is empty and fill it with field_B. The matched count of 176394 is just a small subset of the total documents, the rest of which have content in field_A. – jbat Dec 06 '22 at 22:20
  • @WernfriedDomscheit I saw this answer, but as far as I can tell it just fills in field_A with a static phrase, "missing," when I need it to fill in from field_B in the relevant document. Please let me know if I misunderstood what it says! – jbat Dec 06 '22 at 22:24
  • `{field_A:{$nin:\[null,""\]}},\[{"$set": {field_A : "$field_B" }}\]` - The update query requires a _condition_ and an _update_. The condition would be `{ filedA: null }`. Further, you can use the Updates With Aggregation Pipeline feature available with MongoDB version 4.2 or higher. – prasad_ Dec 07 '22 at 02:02
  • Please show us some sample data. What is the purpose of these backslashes "\" ? You say "fill blank field" and "where field_A is null" (which is not the same!) but your condition is `{field_A: {$nin: [null,""]}}` - which is just the opposite. `$nin` means "**not** in" – Wernfried Domscheit Dec 07 '22 at 06:11
  • @WernfriedDomscheit SO added the backslashes, they weren't in the original query, I edited my post to try and remove them, hopefully they stay removed. – jbat Dec 07 '22 at 13:56
  • @prasad_ This is very helpful, thanks, I clearly have misunderstood the $nin here. However, I'm still having problems getting all the fields to update even using your suggested code. I have updated the question to reflect what I've tried and how it's still not working. – jbat Dec 07 '22 at 14:08
  • Did you try the opposite? `{ $in: [null,""] }` – Wernfried Domscheit Dec 07 '22 at 14:12

1 Answers1

1

There was a lot of discussion in the comments that helped clear up the confusion between the $nin and $in operators, the latter seemingly being what is desired in this situation. There's a separate piece of confusion about matching versus modifying that I mentioned here and will expand on in this answer.

I just can't figure out why the "" won't update to the value of field_B.

In short, the value of field_B appears to be "" for those 2815 matching documents. Since that is also already the value of field_A there is nothing for the database to change.

The output you got from the database is:

{ "acknowledged" : true, "matchedCount" : 2815, "modifiedCount" : 0 }

Referring to the documentation, the matchedCount field represents the following:

The number of documents selected for update. If the update operation results in no change to the document, e.g. $set expression updates the value to the current value, nMatched can be greater than nModified.

And the modifiedCount field is:

The number of existing documents updated. If the update/replacement operation results in no change to the document, such as setting the value of the field to its current value, nModified can be less than nMatched.

Both of those definitions already state that the database will avoid modifying the document (and incrementing the counter) if the update would not change the document from its current state.

Consider a collection with the following three documents:

> db.collection.find()
[
  { _id: 1, field_A: 123, field_B: 456 },
  { _id: 2, field_A: '', field_B: 'ABC' },
  { _id: 3, field_A: '', field_B: '' }
]

When we execute your update (in the mongosh shell), we receive the following response:

> db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 2,
  modifiedCount: 1,
  upsertedCount: 0
}

Here we have matchedCount: 2. There were 2 documents that matched the query predicates of {field_A: ""}. That would be the documents with _id:2 and _id:3.

But we only have modifiedCount: 1. Only one of the documents (id:2) needed to change. The other document (id:3) already had the same value for field_A and field_B (an empty string) so applying the update wouldn't change the document. After the operation completes, the documents now look as follows with the second one having been modified:

> db.collection.find()
[
  { _id: 1, field_A: 123, field_B: 456 },
  { _id: 2, field_A: 'ABC', field_B: 'ABC' },
  { _id: 3, field_A: '', field_B: '' }
]

If we run the update again, we get:

> db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 0,
  upsertedCount: 0
}

This time the second document no longer matched (since we updated field_A to no longer be an empty string). However, the third document still matches but still does not get modified because field_A and field_B are both empty strings. Here are some related commands that help confirm that:

> //Count the number of documents where both `field_A` and `field_B` are empty strings:
> db.collection.countDocuments({field_A:'', field_B:''})
1
> //Change the filter of the update to exclude documents where `field_B` is an empty string:
> db.collection.updateMany({field_A:"", field_B:{$ne:""}},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 0,
  modifiedCount: 0,
  upsertedCount: 0
}

I would expect the count above to return 2815 in your environment. If you do not expect field_B to contain an empty string for those documents then you may need to take additional actions to correct that.

user20042973
  • 4,096
  • 2
  • 3
  • 14
  • this was EXACTLY my problem. My data is extremely dense and hard to quickly look at and see all the fields, so I wasn't catching this. Given that I completely messed up $nin and $in, I just assumed I was continuing to have problems understanding MongoDB syntax, rather than problems with the quality of the underlying data. Many thanks -- and thanks to everyone who jumped in to help previously! – jbat Dec 07 '22 at 15:26