2

It might a long shot, but here we go. Let's suppose I have the following document definition:

public class Test1 {
    public ObjectId Id {get;set;}
    public int NonUniqueId { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

Using IMongoCollection<Test1>, I can filter the results to documents the ones where Name = "somevalue", sort by Price and return 10 rows only.

var builder = Builders.Filters;
var filter = builder.Where(x=>x.Name == "somevalue");

var result = await collection.Find(filter).SortBy(x=>x.Price).Limit(10).ToListAsync();

Now to the question at hand. If more than one document is returned by the filter (with Name = "somevalue") and NonUniqueId occurs more than once in that set, I would like to exclude duplicates NonUniqueId and return just the one with the lowest Price.

This logic is quite easy to implement in the code: fetch 10 results, if any of them are 'duplicates' (i.e. NonUniqueId occurs more than once), keep the one with the smallest price and run another search excluding this NonUniqueId. However, it requires multiple DB calls which is not great (up to 10 in the worst case).

In MongoDB (with C# driver), is there a way to ignore these 'duplicate' results based on the property and, ideally, return just the min of another property (or anything to the same effect)?

Dawid O
  • 6,091
  • 7
  • 28
  • 36

2 Answers2

2

give this pipeline a try:

db.Product.aggregate([
    {
      $match: { Name: "book" }
    },
    {
      $sort: { Price: 1 }
    },
    {
      $group: {
        _id: "$NonUniqueId",
        product: { $first: "$$ROOT" }
      }
    },
    {
      $replaceWith: "$product"
    }
])

https://mongoplayground.net/p/sqSjiXQhdlF

update: added c#

c# version is a bit inconvenient due to the projection needed:

var result = collection.AsQueryable()
               .Where(p => p.Name == "book")
               .OrderBy(p => p.Price)
               .GroupBy(p => p.NonUniqueId)
               .Select(g => new Product
               {
                   Id = g.First().Id,
                   Name = g.First().Name,
                   NonUniqueId = g.First().NonUniqueId,
                   Price = g.First().Price
               })
               .ToList();

test program:

using MongoDB.Driver;
using MongoDB.Driver.Linq;
using MongoDB.Entities;
using MongoDB.Entities.Core;
using System.Linq;

namespace StackOverFlow
{
    public class Product : Entity
    {
        public int NonUniqueId { get; set; }
        public string Name { get; set; }
        public decimal Price { get; set; }
    }

    public static class Program
    {
        private static void Main()
        {
            new DB("test");

            var result = DB.Queryable<Product>()
                           .Where(p => p.Name == "book")
                           .OrderBy(p => p.Price)
                           .GroupBy(p => p.NonUniqueId)
                           .Select(g => new Product
                           {
                               ID = g.First().ID,
                               Name = g.First().Name,
                               NonUniqueId = g.First().NonUniqueId,
                               Price = g.First().Price
                           })
                           .ToList();
        }
    }
}
Dĵ ΝιΓΞΗΛψΚ
  • 5,068
  • 3
  • 13
  • 26
  • Yeah... The projection will be the issue here. The collection has got 100000s of documents. I don't particularly fancy downloading all of them, but you can run a raw query from the driver. Does the limit shot circuit the group by? – Dawid O Feb 25 '20 at 16:17
  • @Dawid you can limit without an issue. the limit is imposed on the cursor afaik. – Dĵ ΝιΓΞΗΛψΚ Feb 25 '20 at 16:52
  • @DawidO thought you might be interested in a better way to run raw queries using c#. have a look at [this library](https://github.com/dj-nitehawk/MongoDB.Entities/wiki/13.-String-Templates) i've written. – Dĵ ΝιΓΞΗΛψΚ Feb 26 '20 at 10:05
  • Perhaps I am missing something obvious here, I can't do `collection.AsQueryable()` because I don't want to download the entire collection on my local machine. With regards to the library, could you provide more detail how to integrate it with the code sample I provided? – Dawid O Feb 29 '20 at 11:21
  • @DawidO here you go: https://gist.github.com/dj-nitehawk/bda542ca93609fe12a670ccab89dc1d2 – Dĵ ΝιΓΞΗΛψΚ Feb 29 '20 at 13:22
-1

Try the Distinct method, as shown in this answer. You can even define a custom Equals function (by defining a class that implements IEqualityComparer) that performs the checks you need to perform

leqo
  • 356
  • 4
  • 15