
I have a collection of sales by branch, called "operations", with documents like:

{id: 1, branch: "a", amount: 5, type: "sell"}
{id: 2, branch: "a", amount: 9, type: "pass"}
{id: 3, branch: "a", amount: 12, type: "credit"}
{id: 4, branch: "b", amount: 2, type: "pass"}
{id: 5, branch: "b", amount: 22, type: "pass"}
{id: 6, branch: "b", amount: 6, type: "sell"}

We need to get the `type` field of the document with the maximum `amount`, grouped by `branch`. So, as a result, we should have:

{branch: "a", type: "credit"}
{branch: "b", type: "pass"}
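For reference, the desired transformation can be sketched in plain Python on the sample documents above (an illustration of the goal, not MongoDB code):

```python
# "type where amount is max, grouped by branch", on the sample documents.
operations = [
    {"id": 1, "branch": "a", "amount": 5, "type": "sell"},
    {"id": 2, "branch": "a", "amount": 9, "type": "pass"},
    {"id": 3, "branch": "a", "amount": 12, "type": "credit"},
    {"id": 4, "branch": "b", "amount": 2, "type": "pass"},
    {"id": 5, "branch": "b", "amount": 22, "type": "pass"},
    {"id": 6, "branch": "b", "amount": 6, "type": "sell"},
]

best = {}  # branch -> document with the highest amount seen so far
for doc in operations:
    current = best.get(doc["branch"])
    if current is None or doc["amount"] > current["amount"]:
        best[doc["branch"]] = doc

result = [{"branch": b, "type": d["type"]} for b, d in sorted(best.items())]
# result == [{"branch": "a", "type": "credit"}, {"branch": "b", "type": "pass"}]
```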
Maxick

4 Answers


Try this!

const aggregation = [
  { $sort: { amount: -1 } },
  {
    $group: {
      _id: "$branch",
      type: { $first: "$type" },
      branch: { $first: "$branch" },
    },
  },
  { $project: { _id: 0 } },
];
db.test.aggregate(aggregation);
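Why this pipeline works can be seen by emulating its two stages in plain Python (again an illustration, not MongoDB code): after sorting by `amount` descending, the first document encountered per branch is exactly the one with the maximum amount, which is what `$first` picks up.

```python
operations = [
    {"id": 1, "branch": "a", "amount": 5, "type": "sell"},
    {"id": 2, "branch": "a", "amount": 9, "type": "pass"},
    {"id": 3, "branch": "a", "amount": 12, "type": "credit"},
    {"id": 4, "branch": "b", "amount": 2, "type": "pass"},
    {"id": 5, "branch": "b", "amount": 22, "type": "pass"},
    {"id": 6, "branch": "b", "amount": 6, "type": "sell"},
]

# Stage 1: emulate { $sort: { amount: -1 } }
by_amount_desc = sorted(operations, key=lambda d: d["amount"], reverse=True)

# Stage 2: emulate $group by branch with $first — keep only the first
# (i.e. highest-amount) document seen for each branch.
groups = {}
for doc in by_amount_desc:
    groups.setdefault(doc["branch"], {"branch": doc["branch"], "type": doc["type"]})

result = sorted(groups.values(), key=lambda d: d["branch"])
# result == [{"branch": "a", "type": "credit"}, {"branch": "b", "type": "pass"}]
```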
Jannchie
  • Yes, it works. Cool. But I'm afraid MongoDB will fail on a large number of documents because of the sorting, so I'm searching for another solution. Thanks – Maxick Dec 16 '20 at 11:11
  • Maybe you should create an index on the "amount" field? – Jannchie Dec 16 '20 at 11:16

I have a solution, but it's not suitable:

$collection->aggregate([
    [
        '$sort' => [ 'amount' => -1]
    ],
    [
        '$group' => [
            '_id' => '$branch',
            'type' => [ '$first' => '$type'],
        ]
    ]
]);

It works well, but only on collections containing fewer than 300'000 documents.

Maxick

You can use the aggregation pipeline this way:

First, sort the documents in descending order so that the document with the greatest amount comes first.

Then $group by $branch and select the first type.

Check this query:

db.collection.aggregate([
  {
    "$sort": {
      "amount": -1
    }
  },
  {
    "$group": {
      "_id": "$branch",
      "type": {
        "$first": "$type"
      }
    }
  }
])

And an example here

J.F.
  • Yes, good move, but it fails on half a million documents, unfortunately – Maxick Dec 16 '20 at 11:00
  • I suppose the `$sort` stage is not efficient for 500,000 documents. Try [this](https://mongoplayground.net/p/_pL0YVzB27L). If it is efficient I'll update my answer to explain. – J.F. Dec 16 '20 at 11:15
  • Cool one. Unfortunately, 140'000 rows are OK, but on 500'000 I got "memory exceeded". I think I should divide the raw data into several parts. – Maxick Dec 16 '20 at 12:38
  • When I try to divide it into parts it brings incorrect results. Operations are staggered over time, so some points are counted several times. – Maxick Dec 16 '20 at 14:50

An index is, IMHO, the solution! Thanks, Jannchie.

    $collection->createIndex([
        'amount' => 1,
        'branch' => -1,
        'type' => -1
    ]);

550'000 records were processed on the fly. And this index turned out to be more than two times smaller than the "_id" index.
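A related point for anyone who hits the same "memory exceeded" error: MongoDB's `aggregate` also accepts an `allowDiskUse` option that lets the `$sort` stage spill to disk instead of failing at the in-memory limit. This is slower than an index-backed sort, so the index above is still the better fix, but it's an option when adding an index isn't possible. A sketch (collection name taken from the question):

```js
// Allow the $sort stage to spill to disk instead of failing
// when it exceeds the in-memory sort limit.
db.operations.aggregate(
  [
    { $sort: { amount: -1 } },
    { $group: { _id: "$branch", type: { $first: "$type" } } }
  ],
  { allowDiskUse: true }
);
```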

Maxick