0

I am trying to implement a function that collects unread messages from an articles collection. Each article in the collection has a "discussions" entry with discussion comment subdocuments. An example of such a subdocument is:

{
    "id": NumberLong(7534),
    "user": DBRef("users", ObjectId("...")),
    "dt_create": ISODate("2015-01-26T00:10:44Z"),
    "content": "The discussion comment content"
}

The parent document has the following (partial) structure:

{
    model: {
        id: 17676,
        title: "Article title",
        author: DBRef("users", ObjectId(...)),
        // a bunch of other fields here
    },
    statistics: {
        // Statistics will be stored here (pageviews, etc)
    },
    discussions: [
        // Array of discussion subdocuments, like the one above
    ]
}

Each user also has a last_viewed entry which is a document, an example is as follows:

{
    "17676" : "2015-01-10T00:00:00.000Z",
    "18038" : "2015-01-10T00:00:00.000Z",
    "18242" : "2015-01-20T00:00:00.000Z",
    "18325" : "2015-01-20T00:00:00.000Z"
}

This means that the user has looked at discussion comments for the last time on January 10th 2015 for articles with IDs 17676 and 18038, and on January 20th 2015 for articles with IDs 18242 and 18325.

So I want to collect discussion entries from the article documents, and for article with ID 17676, I want to collect the discussion entries that were created after 2015-01-10, and for article with ID 18242, I want to show the discussion entries created after 2015-01-20.

UPDATED

Based on Neil Lunn's reply, the function I have created so far is:

function getUnreadDiscussions(userid) {
    user = db.users.findOne({ 'model.id': userid });
    last_viewed = [];

    for(var i in user.last_viewed) {
        last_viewed.push({
            'id': parseInt(i),
            'dt': user.last_viewed[i]
        });
    }

    result = db.articles.aggregate([
        // For now, collect just articles the user has written
        { $match: { 'model.author': DBRef('users', user._id) } },
        { $unwind: '$discussions' },
        { $project: {
                'model': '$model',
                'discussions': '$discussions',
                'last_viewed': {
                    '$let': {
                        'vars': { 'last_viewed': last_viewed },
                        'in': {
                            '$setDifference': [
                                { '$map': {
                                    'input': '$$last_viewed',
                                    'as': 'last_viewed',
                                    'in': {
                                        '$cond': [
                                            { '$eq': [ '$$last_viewed.id', '$model.id' ] },
                                            '$$last_viewed.dt',
                                            false
                                        ]
                                    }
                                } },
                                [ false ]
                            ]
                        }
                    }
                }
            }
        },
        // To get a scalar instead of a 1-element array:
        { $unwind: '$last_viewed' },
        // Match only those that were created after last_viewed
        { $match: { 'discussions.dt_create': { $gt: '$last_viewed' } } },
        { $project: { 
            'model.id': 1, 
            'model.title': 1, 
            'discussions': 1, 
            'last_viewed': 1 
        } }
    ]);

    return result.toArray();
}

The whole $let thing, and the $unwind after that, transforms the data into the following partial projection (with the last $match commented out):

{
    "_id" : ObjectId("54d9af1dca71d8054c8d0ee3"),
    "model" : {
        "id" : NumberLong(18325),
        "title" : "Article title"
    },
    "discussions" : {
        "id" : NumberLong(7543),
        "user" : DBRef("users", ObjectId("54d9ae24ca71d8054c8b4567")),
        "dt_create" : ISODate("2015-01-26T00:10:44Z"),
        "content" : "Some comment here"
    },
    "last_viewed" : ISODate("2015-01-20T00:00:00Z")
},
{
    "_id" : ObjectId("54d9af1dca71d8054c8d0ee3"),
    "model" : {
        "id" : NumberLong(18325),
        "title" : "Article title"
    },
    "discussions" : {
        "id" : NumberLong(7554),
        "user" : DBRef("users", ObjectId("54d9ae24ca71d8054c8b4567")),
        "dt_create" : ISODate("2015-01-26T02:03:22Z"),
        "content" : "Another comment here"
    },
    "last_viewed" : ISODate("2015-01-20T00:00:00Z")
}

So far so good here. But the problem now is that the $match to select only the discussions created after the last_viewed date is not working. I am getting an empty array response. However, if I hard-code the date and put in $match: { 'discussions.dt_create': { $gt: ISODate("2015-01-20 00:00:00") } }, it works. But I want it to take it from last_viewed.

Community
  • 1
  • 1
Wouter Thielen
  • 1,016
  • 9
  • 21
  • 1
    The flow goes like this, first the java script in the code is executed, then the aggregation code is passed on to the server. So the evaluation of `last_viewed['$model.id']` takes place before the aggregation code is executed. The `last_viewed` variable does not have a property `$model.id` resulting in an undefined value. Hence that is the incorrect way. – BatScream Feb 12 '15 at 10:25
  • Thanks @BatScream. Yes, I found this: http://stackoverflow.com/questions/24335983/how-to-use-a-javascript-object-inside-mongodb-aggregation-pipeline and it mentions that the external variable would indeed not be available within the aggregation process. I am looking at Neil's answer there, wondering if I can reformat my `last_viewed` object like that and see if I can get it to work using `$let`. But I am unsure how the `$let` block works. – Wouter Thielen Feb 12 '15 at 10:29
  • 1
    You will not be able to dynamically reference the value of a field like you are trying to do. It'd be nice if you could post a full example of the parent document of the discussion entries, because then I could be more specific about how things ought to be structured, but I think the best path forward would be to have discussion entries be separate documents. You'll probably need to use one query per article to get the unread discussions. – wdberkeley Feb 12 '15 at 20:10
  • @wdberkeley Indeed, I am now aware of not being able to dynamically reference to a variable during the aggregate process. So I have updated the question with an example of the parent document, and with what I have done so far, i.e. implemented Neil Lunn's solution, and ran into a different problem: the last `$match` operation is not working here. From the projection I have at the end, it seems to me that it should work... Any ideas? – Wouter Thielen Feb 13 '15 at 03:00

1 Answers1

0

I found another SO thread where this issue has been resolved by using the $cmp operator.

The final part of the aggregation would be:

[
    { /* $match, $unwind, $project, $unwind as before */ },
    { $project: {
        'model': 1,
        'discussions': 1,
        'last_viewed': 1,
        'compare': {
            $cmp: [ '$discussions.dt_create', '$last_viewed' ]
        }
    } },
    { $match: { 'compare': { $gt: 0 } } }
]

The aggregation framework is great, but it takes quite a different approach in problem-solving. Hope this helps anyone!

I'll keep the question unanswered in case anyone else has a better answer/method. If this answer has been upvoted enough times, I'll accept this one.

Community
  • 1
  • 1
Wouter Thielen
  • 1,016
  • 9
  • 21