14

How to achieve count distinct function on elastic search type using sql4es driver?

Select distinct inv_number , count(1) from invoices;

But it returns the total count of the particular invoice number.

PravinKumar.R
  • 335
  • 1
  • 3
  • 15

5 Answers5

12
  {
      "size": 0, 
      "aggs": {
        "total_invoices": {
          "terms": {
            "field": "inv_number" 

        },
        "aggs": {
          "unique_invoiceid": {
            "cardinality": {
              "field": "inv_number"
            }
          }
        }
      }
    }

This will give you the invoice number as key and distict value in unique_invoiceid

prasad kp
  • 833
  • 1
  • 10
  • 32
7

Elasticsearch doesn't support deterministic DISTINCT counts (source). It supports only approximate distinct counters like "cardinality". One way to count distincts in a deterministic way is to aggregate them using "terms" and count buckets from result.

slm
  • 15,396
  • 12
  • 109
  • 124
asu
  • 539
  • 6
  • 15
4

Since, the OP is using sql4es jdbc driver, he is asking for a sql query for his use-case :

SELECT COUNT(DISTINCT inv_number) from invoices;

it returns the number of distinct values of the specified column

Rahul
  • 15,979
  • 4
  • 42
  • 63
0

This should work to count exact distinct values:

curl -X POST "localhost:9200/invoices/_search?size=0&pretty" -H 'Content-Type: application/json' -d '{
"aggs" : {
    "types_count" : {
      "value_count" : { "field" : "inv_number" }
    },
    "group_by_status": {
      "terms": {
        "field": "inv_number"
      }
    }
}

}'

user582175
  • 954
  • 12
  • 17
0

Cardinality works if your dataset uniqueness/fragmentation is less than 2000. If you really care for accuracy instead of retrieval speed and memory utilization then you can go for scripted metric which will give you accurate results.

https://pratik-patil.medium.com/accurate-distinct-count-and-values-from-elasticsearch-dadce51ad4a7

buddemat
  • 4,552
  • 14
  • 29
  • 49
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Oct 12 '21 at 07:42