0

Consider the table fields as follows.

Appid       Client_name          is_real     RTT
100         C1                   1           1
200         C1                   1           6
200         C2                   1           7
100         C1                   1           9
200         C1                   0           7

Now I need total number of unique real Appid's in the table. We can say one appid record is real by if 'is_real' is 1. In above table, we have only 3 real Appid's. Which are (100,C1), (200,C1) and (200, C2).

Postgesql command:

Select sum(r) 
from (select count(is_real) as r from table group by Appid, Client_name) as t;

I don't want any recursive query. If you can fetch with single select query, it would be helpful.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Kranthi
  • 195
  • 3
  • 13

2 Answers2

1

total number of unique real Appid's in the table

I assume is_real is 1 = true, 0 = false.

SELECT COUNT(DISTINCT Appid)
FROM table
WHERE is_real = 1;
Bacon Bits
  • 30,782
  • 5
  • 59
  • 66
  • Thanks for reply. Your answer is helpful to resolve the issue. But as per my requirement, need to add distinct appid and client_name. – Kranthi Nov 20 '14 at 06:25
1

Since you seem to define a unique id by (Appid, Client_name) (which is confusing, since you are mixing terms):

SELECT COUNT(DISTINCT (Appid, Client_name)) AS ct
FROM   tbl
WHERE  is_real = 1;

(Appid, Client_name) is a row-type expression, short for ROW(Appid, Client_name). Only distinct combinations are counted.

Another trick to get this done without subquery is to use a window function:

SELECT DISTINCT count(*) OVER () AS ct
FROM   tbl
WHERE  is_real = 1
GROUP  BY Appid, Client_name;

But neither is going to be faster than using a subquery (which is not a recursive query):

SELECT count(*) AS ct
FROM  (
   SELECT 1
   FROM   tbl
   WHERE  is_real = 1
   GROUP  BY Appid, Client_name
   ) sub;

That's what I would use.

It's essential to understand the sequence of events in a SELECT query:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks for the answer. Among these which is the faster and why? Can you explain briefly ? – Kranthi Nov 20 '14 at 06:34
  • @Kranthi: I would expect the last one to be fastest, closely followed by the first. The 2nd is more complex internally. It really depends on your exact table definition, cardinalities, data distribution, typical use patterns etc. None of this is in your question. Test with `EXPLAIN (ANALYZE, TIMING OFF)`, compare best of 5 or so to exclude caching and one-time effects. If you need top performance, a matching index may help. I'd suggest to start a *new question* with detailed information if you want to explore this in depth. You can always link to this one for reference. – Erwin Brandstetter Nov 20 '14 at 07:02