I have a field on my user documents called _badges, which is an array of ObjectIds. I am attempting to remove duplicate values from the array and have been successful using an async iterator in a Mongoose script, but it is a bit too slow for my liking:
async function removeDuplicateBadges() {
  // Iterate the cursor one document at a time to keep memory bounded
  for await (const userDoc of User.find()) {
    console.log(userDoc.displayName)
    // Compare by string form, since two ObjectIds are never === equal;
    // drop null entries instead of mapping them to undefined
    const badgesStringArray = userDoc._badges
      .filter(badge => badge !== null)
      .map(badge => badge.toString())
    const uniqueBadgesArray = [...new Set(badgesStringArray)]
    await User.findByIdAndUpdate(userDoc._id, { _badges: uniqueBadgesArray })
  }
}
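For what it's worth, the deduplication step itself can be isolated into a small pure helper (the name dedupeIds is mine, not from any library) that compares ids by their string form and drops null holes entirely, rather than letting them survive the Set:

```javascript
// Hypothetical helper: dedupe an array of ObjectId-like values by their
// string representation, skipping null/undefined entries.
function dedupeIds(ids) {
  const seen = new Set()
  const result = []
  for (const id of ids) {
    if (id == null) continue // drop null holes instead of keeping one
    const key = id.toString()
    if (!seen.has(key)) {
      seen.add(key)
      result.push(key)
    }
  }
  return result
}
```

This keeps first-occurrence order, unlike some server-side set operators.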
I tried doing the same with the following aggregation command, but that did not remove the duplicate values on the documents stored in the database. It only returns results, since the aggregation framework is meant to query and transform data, not mutate the underlying collection:
db.getCollection("users").aggregate(
[
{
"$unwind" : {
"path" : "$_badges"
}
},
{
"$group" : {
"_id" : "$_id",
"_badges" : {
"$addToSet" : "$_badges"
}
}
}
])
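From what I understand, the pipeline above could be made to persist its output by ending in a $merge stage (MongoDB 4.2+), which writes each group back onto its matching source document; a sketch, assuming the same users collection:

```javascript
db.getCollection("users").aggregate([
  { "$unwind": { "path": "$_badges" } },
  { "$group": { "_id": "$_id", "_badges": { "$addToSet": "$_badges" } } },
  // $merge matches on _id and folds the deduplicated _badges array
  // back into the existing document
  { "$merge": {
      "into": "users",
      "on": "_id",
      "whenMatched": "merge",
      "whenNotMatched": "discard"
  } }
])
```

One caveat: $unwind skips documents whose _badges array is missing or empty, so those documents would simply be left untouched.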
Any hints on effective ways to remove the duplicate values that are more time efficient than the async iterator approach above would be appreciated.
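One direction I am considering (assuming MongoDB 4.2+, which supports pipeline-style updates) is a single updateMany whose update is an aggregation pipeline: $setUnion of the array with an empty array removes duplicates server-side, with no documents pulled to the client. A sketch:

```javascript
db.getCollection("users").updateMany(
  // only touch documents whose array has at least two elements
  { "_badges.1": { "$exists": true } },
  // pipeline update: $setUnion with [] deduplicates in place
  [ { "$set": { "_badges": { "$setUnion": ["$_badges", []] } } } ]
)
```

Note that $setUnion treats the array as a set, so the original element order is not guaranteed to be preserved.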