1

I need to be able to create visualizations that display the difference between consecutive measurements.

I have MongoDB Charts connected to a MongoDB instance for visualizing data. The problem is AFAIK MongoDB Charts can only display data from queries or aggregation pipelines. So far I have no luck getting the below task done in aggregation pipelines. I'm starting to wonder if I've chosen the wrong tools for this job.

My data looks something like this after filtering and sorting it:

{
    "run_id": "run_x",
    "measurements": {
        "true_positives": 5,
        "false_positives": 1
    }
},
{
    "run_id": "run_y",
    "measurements": {
        "true_positives": 6,
        "false_positives": 0
    }
}

And I need to calculate deltas between runs so that I end up with:

{
    "run_id": "run_x",
    "measurements": {
        "true_positives": 5,
        "tp_delta": 0
        "false_positives": 1,
        "fp_delta": 0
    }
},
{
    "run_id": "run_y",
    "measurements": {
        "true_positives": 6,
        "tp_delta": 1
        "false_positives": 0,
        "fp_delta": -1
    }
}

I have been able to calculate the deltas to the first or last measurement, but that's not enough. I specifically need consecutive deltas. Calculating deltas with a programming language would, of course, be trivial, but I'd need a way to feed the results to MongoDB Charts. MapReduce would work, except I don't think it works with MongoDB Charts. Saving intermediate results with deltas to the database isn't feasible either since the filtering conditions change.

Is there a way to achieve what I need with aggregation pipelines or should I look at other options? I'm pretty close to ditching MongoDB Charts at this point.

1 Answers1

0

Well you may apply a trick solution. If we cross with the same collection and apply correct operators, we can get previous / next values for each document.

I assume run_id is unique, otherwise we need to use _id field.

Pseudocode

Given                => 1 2 3 4 5
We order descendant. => 5 4 3 2 1 //It will allow us to get "previous value"    
//Note: If we order ascending, it will allow us to get "next value"

for (var i=0; i<size(collection); i++)
    for (var j=0; j<size(collection); j++)
        // i=0, j=0 5 < 5 - false
        // i=0, j=1 5 < 4 - true
        // i=0, j=2 5 < 3 - true
        // ...
        if (collection[j].run_id < collection[i].run_id)
            next_measure.add(collection[j])
    //for j
    we leave only first item for next_measure
//for i

// Now, we substract measure values
// Order back ascending

db.collection.aggregate([
  {
    $sort: {
      run_id: -1
    }
  },
  {
    $lookup: {
      from: "collection",
      let: {
        prev_value: "$$ROOT"
      },
      pipeline: [
        {
          $sort: {
            run_id: -1
          }
        },
        {
          $match: {
            $expr: {
              $lt: [
                "$run_id",
                "$$prev_value.run_id"
              ]
            }
          }
        },
        {
          $limit: 1
        }
      ],
      as: "next_measure"
    }
  },
  {
    $addFields: {
      tmp: {
        $arrayElemAt: [
          "$next_measure",
          0
        ]
      }
    }
  },
  {
    $project: {
      _id: 0,
      run_id: 1,
      measurements: {
        true_positives: 1,
        tp_delta: {
          $ifNull: [
            {
              $subtract: [
                "$measurements.true_positives",
                "$tmp.measurements.true_positives"
              ]
            },
            0
          ]
        },
        false_positives: 1,
        fp_delta: {
          $ifNull: [
            {
              $subtract: [
                "$measurements.false_positives",
                "$tmp.measurements.false_positives"
              ]
            },
            0
          ]
        }
      }
    }
  },
  {
    $sort: {
      run_id: 1
    }
  }
])

MongoPlayground

Valijon
  • 12,667
  • 4
  • 34
  • 67
  • This does answer my question and does work. So I'll mark it as the solution. It's not great with MongoDB Charts though, because the only way to use $lookup with it is to define a new data source with a custom aggregation pipeline. It works but doesn't scale to usage where you have a large amount of charts. – Mikko Mattila Jan 15 '20 at 11:54
  • You can create [View](https://docs.mongodb.com/manual/core/views/) which runs this pipeline in real-time. For scaling, you need to store them periodically into separate collection and use for analytics, charts, etc.. – Valijon Jan 15 '20 at 12:14