Get multiple counts with one Cosmos DB query?

Question

Consider these queries:

SELECT COUNT(1) AS failures 
FROM c 
WHERE c.time = 1623332779 AND c.status = 'FAILURE'
   
SELECT COUNT(1) AS successes 
FROM c 
WHERE c.time = 1623332779 AND c.status = 'SUCCESS'

How can I combine these two distinct queries into one query?

I tried repurposing the answers from How to get multiple counts with one SQL query?, but ran into a few problems:

COUNT(*) throws an error "Syntax error, incorrect syntax near '*'."
UNION throws "Syntax error, incorrect syntax near 'UNION'."

I also experimented with

SELECT 
    SUM(CASE WHEN c.time = 1623332779 THEN 1 else 0 end) 
FROM c

but this leads to another syntax error. I noticed that

SELECT COUNT(1) AS mycounter, COUNT(1) AS mycounter2 
FROM c 
WHERE c.time = 1623332779

returns

[
    {
        "mycounter": 3,
        "mycounter2": 3
    }
]

but I was unable to link these distinct counters to distinct queries.

score 3 · Answer 1 · answered Jul 23 '21 at 12:08

3

Instead of doing counts of the overall query, you can use GROUP BY to get counts in a single query. For example:

SELECT c.time, c.status, COUNT(c.status) AS statuscount
FROM c
WHERE c.time = "1623332779"
GROUP BY c.time, c.status

This won't give you explicit counts called "successes" and "failures" but it will return both counts, something like:

[
  {
    "time": "1623332779",
    "status": "FAILURE",
    "statuscount": 123
  },
  {
    "time": "1623332779",
    "status": "SUCCESS",
    "statuscount": 456
  }
]

answered Jul 23 '21 at 12:08

David Makogon

69,407
21
141
189

This works, but it uses far more RUs than doing two separate queries, which makes sense given 404's statement that "there's no proper way to consolidate multiple queries." – jskattt797 Jul 23 '21 at 17:54
@jskattt797 - have you benchmarked this query against your own data? This query uses indexes; it's not a collection-scan. It's quite possible it uses more RU, but... you should benchmark at least, as this query is not performing independent, separate counts. – David Makogon Jul 23 '21 at 18:21
I tested this query on my data and in one case it used approximately 178 RUs, whereas the individual SELECT COUNT(1) ... queries used 3 RUs each. User 404's solution used ~170 RUs. I should note that I queried for c.time within a range of possible times rather than equal to one particular time. Is this performance surprising? Perhaps I should do more tests. – jskattt797 Jul 24 '21 at 03:58
Regarding RU/s this single query. In my case it was 26Ru/s where two separate 2x21Ru. – Leszek P Jun 02 '23 at 13:46

NotFound · Accepted Answer · 2021-07-22T09:00:25.467

2

The following should work. The count operator skips values that are undefined which allows you to filter out rows from it:

SELECT 
    COUNT(c.status = 'SUCCESS' ? 1 : undefined) AS successes, 
    COUNT(c.status = 'FAILURE' ? 1 : undefined) AS failures
FROM c
WHERE c.time = 1623332779

It ruins performance though as it doesn't use indexing at all for the count. So you're better off using two seperate queries. If you really want to use a single request you could create a stored procedure that runs both queries and pastes the results together.

edited Jul 22 '21 at 09:00

answered Jul 22 '21 at 07:30

NotFound

5,005
2
13
33

So as far as you know, there is no way to consolidate multiple queries while retaining good performance? – jskattt797 Jul 22 '21 at 16:28
And what exactly do you mean by "performance"? I don't understand how performing two separate queries could possibly require *fewer* RUs than performing the same two actions in one query. – jskattt797 Jul 23 '21 at 00:21
1

@jskattt797 When you use the method from my post the filters within the count statement do not use the indexer even if the columns are indexed. So it'll move through every row individually to determine whether it should be counted or not. Whereas your original queries are optimized and can be efficiently be retrieved without the amount of rows mattering a whole lot. As example; if I would have 10,000 rows with a `c.time = 1623332779` it would result in ~2000RU. Whereas the two queries are both <10 RU. – NotFound Jul 23 '21 at 07:19
1

@jskattt797 No; There's no proper way to consolidate multiple queries, but it's also not hugely important since Cosmos can handle concurrency really well. So you could call both queries in parallel. – NotFound Jul 23 '21 at 07:21

Get multiple counts with one Cosmos DB query?

2 Answers2