My goal is to do a sum of the messages_sent and emails_sent per each DISTINCT provider_id value for a given time range (fromDate < stats_date_id < toDate), but without specifying a provider_id. In other words, I need to know about any and all Providers within the specified time range and to sum their messages_sent and emails_sent.
I have a Cassandra table using an express-cassandra schema (in Node.js) as follows:
module.exports = {
fields: {
stats_provider_id: {
type: 'uuid',
default: {
'$db_function': 'uuid()'
}
},
stats_date_id: {
type: 'timeuuid',
default: {
'$db_function': 'now()'
}
},
provider_id: 'uuid',
provider_name: 'text',
messages_sent: 'int',
emails_sent: 'int'
},
key: [
[
'stats_date_id'
],
'created_at'
],
table_name: 'stats_provider',
options: {
timestamps: {
createdAt: 'created_at', // defaults to createdAt
updatedAt: 'updated_at' // defaults to updatedAt
}
}
}
To get it working, I was hoping it'd be as simple as doing the following:
let query = {
stats_date_id: {
'$gt': db.models.minTimeuuid(fromDate),
'$lt': db.models.maxTimeuuid(toDate)
}
};
let selectQueries = [
'provider_name',
'provider_id',
'count(direct_sent) as direct_sent',
'count(messages_sent) as messages_sent',
'count(emails_sent) as emails_sent',
];
// Query stats_provider table
let providerData = await db.models.instance.StatsProvider.findAsync(query, {select: selectQueries});
This, however, complains about needing to filter the results:
Error during find query on DB -> ResponseError: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance
.
I'm guessing you can't have a primary key and do date range searches on it? If so, what is the correct approach to this sort of query?