I have a MongoDB collection that stores clickstream data for each day. It has a structure like -
{"utcDate": ISODate(date1), "userToken":"user-id1", ..}
{"utcDate": ISODate(date1), "userToken":"user-id2", ..}
{"utcDate": ISODate(date2), "userToken":"user-id1", ..}
{"utcDate": ISODate(date2), "userToken":"user-id2", ..}
I am trying to get the daily active users (the number of distinct userToken values per day) within a certain date range. This is my current query -
[
  {
    "$project": {
      "utcDate~~~day": {
        "$let": {
          "vars": {
            "column": "$utcDate"
          },
          "in": {
            "___date": {
              "$dateToString": {
                "format": "%Y-%m-%d",
                "date": "$$column"
              }
            }
          }
        }
      },
      "userToken": "$userToken"
    }
  },
  {
    "$match": {
      "utcDate~~~day": {
        "$gte": {
          "___date": "2019-04-01"
        },
        "$lte": {
          "___date": "2019-04-08"
        }
      }
    }
  },
  {
    "$project": {
      "_id": "$_id",
      "___group": {
        "utcDate~~~day": "$utcDate~~~day"
      },
      "userToken": "$userToken"
    }
  },
  {
    "$group": {
      "_id": "$___group",
      "count": {
        "$addToSet": "$userToken"
      }
    }
  },
  {
    "$sort": {
      "_id": 1
    }
  },
  {
    "$project": {
      "_id": false,
      "utcDate~~~day": "$_id.utcDate~~~day",
      "count": {
        "$size": "$count"
      }
    }
  },
  {
    "$sort": {
      "utcDate~~~day": -1
    }
  }
]
How do I optimize this query?
I currently have separate single-field indexes on utcDate and userToken. I read that compound indexes can help with this; what should my index look like?
These are my current indexes -
[
  {
    "v" : 2,
    "key" : {
      "userToken" : 1
    },
    "name" : "userToken_1",
    "ns" : "events.user_events"
  },
  {
    "v" : 2,
    "key" : {
      "utcDate" : 1
    },
    "name" : "utcDate_1",
    "ns" : "events.user_events"
  }
]
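
For context on the compound-index question, here is a rough sketch of one direction that might be worth testing. It is not a verified fix. In the query above, the $match runs on a field computed by the preceding $project, so it cannot use the utcDate_1 index. The sketch instead filters on the raw utcDate field in the first stage and only then derives the day string. The helper name buildDailyActiveUsersPipeline, the output field name day, and the compound key { utcDate: 1, userToken: 1 } are all assumptions made for illustration. Whether the compound index actually helps would need to be confirmed with explain() on the real data.

```javascript
// Sketch only: this builds plain objects and does not connect to MongoDB.
// buildDailyActiveUsersPipeline is a hypothetical helper, not a driver API.
function buildDailyActiveUsersPipeline(startInclusive, endExclusive) {
  return [
    // Filter on the raw date field first so an index that starts with
    // utcDate can be considered by the query planner.
    { $match: { utcDate: { $gte: startInclusive, $lt: endExclusive } } },
    // Collapse the events to one document per (day, userToken) pair.
    {
      $group: {
        _id: {
          day: { $dateToString: { format: "%Y-%m-%d", date: "$utcDate" } },
          userToken: "$userToken",
        },
      },
    },
    // Count the distinct users that remain for each day.
    { $group: { _id: "$_id.day", count: { $sum: 1 } } },
    { $project: { _id: false, day: "$_id", count: true } },
    { $sort: { day: -1 } },
  ];
}

// Assumed compound index key; in the mongo shell this could be created with
// db.user_events.createIndex(candidateIndexKey).
const candidateIndexKey = { utcDate: 1, userToken: 1 };

// 2019-04-01 through 2019-04-08 inclusive, expressed as a half-open UTC range.
const pipeline = buildDailyActiveUsersPipeline(
  new Date("2019-04-01T00:00:00Z"),
  new Date("2019-04-09T00:00:00Z")
);
```

The resulting array could be passed to db.user_events.aggregate(pipeline). Grouping twice avoids building a per-day $addToSet array, at the cost of an extra stage, so it is a trade-off to measure and not a guaranteed win.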