part of our data is like this...
{
"imei": "LDSJB0005",
"timestamp": "2017-11-11 14:49:26",
"trip_id" : 16,
"data_type": 2,
"longitude": "-",
"latitude": "-",
"loc_wgs84": {
"type": "Point",
"coordinates": [
0,
0
]
}
}
There are 3 million entries like this in db.vehicles and the index is like this.
> db.vehicles.getIndexes()
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "demo.vehicles"
},
{
"v" : 2,
"key" : {
"imei" : 1,
"data_type" : 1,
"trip_id" : 1
},
"name" : "imei_1_data_type_1_trip_id_1",
"ns" : "demo.vehicles"
},
{
"v" : 2,
"key" : {
"loc_wgs84" : "2dsphere"
},
"name" : "loc_wgs84_2dsphere",
"ns" : "demo.vehicles",
"2dsphereIndexVersion" : 3
}
]
Now, we are doing aggregation query like below. Just trying to get every vehicles' imei which has the largest trip_id value(which means it's the vehicles last ride of the day)
[
{$match: {'data_type': 1}},
{$sort: {imei:1, trip_id:-1}},
{$group: {
_id: '$imei',
imei: {$first: '$imei'},
trip_id: {$first: '$trip_id'}
}}
]
The query takes like 10 minutes to complete. Help.