13

I am trying to use the following query to see if I have duplicates in the DB

SELECT c.VariantNo, count(1) AS jongel FROM c where c.brand = 'XXXX' AND c.Consumer = 'XXX_V2' GROUP BY c.VariantNo HAVING jongel > 1

But I am getting a syntax error close to HAVING

How can I found if I have more than one document with the same VariantNo?

Matt Douhan
  • 2,053
  • 1
  • 21
  • 40

2 Answers2

21

Per my experience,HAVING is not supported.

AS a workaround,you can use this sql:

SELECT d.VariantNo,d.jongel from (Select COUNT(1) AS jongel,c.VariantNo from c where c.brand ='cx' and c.Consumer = 'gx' group by c.VariantNo) d where d.jongel > 1

Below is my test data:

[ { "id": "1", "VariantNo": 2, "brand": "cx", "Consumer": "gx" }, { "id": "2", "VariantNo": 3, "brand": "cx", "Consumer": "gx" }, { "id": "3", "VariantNo": 2, "brand": "cx", "Consumer": "gx" }, { "id": "4", "VariantNo": 3, "brand": "cx", "Consumer": "gx" }, { "id": "5", "VariantNo": 6, "brand": "cx", "Consumer": "gx" }
]

Here is the output:

[ { "VariantNo": 2, "jongel": 2 }, { "VariantNo": 3, "jongel": 2 } ]

Steve Johnson
  • 8,057
  • 1
  • 6
  • 17
  • This works, the only downside is the huge performance toll on large datasets – Juan Javier Triff Cabanas Aug 09 '22 at 14:57
  • actually this doesn't seem to always work, at least I have examples of it not working. My theory is that at least some of the grouping is done client-side in the query tool, probably due to data partitioning, so pushing the having clause to the server results in rows being omitted because they didn't meet the criteria BEFORE the grouping was fully calculated – Richard Hauer Dec 12 '22 at 11:25
1

Cosmos DB doesn't support non-correlated subqueries