2

In a game using PostgreSQL 9.3.10 some players have paid for a "VIP status", which is indicated by vip column containing a date from future:

# \d pref_users

   Column   |            Type             |     Modifiers      
------------+-----------------------------+--------------------
 id         | character varying(32)       | not null
 first_name | character varying(64)       | not null
 last_name  | character varying(64)       | 
 vip        | timestamp without time zone | 

Also players can rate other players by setting nice column to true, false or leaving it at null:

 # \d pref_rep

  Column   |            Type             |                         Modifiers                         
-----------+-----------------------------+-----------------------------------------------------------
 id        | character varying(32)       | not null
 author    | character varying(32)       | not null
 nice      | boolean                     | 

I calculate a "reputation" of VIP-players by issuing this SQL JOIN statement:

# select u.id, u.first_name, u.last_name, 
  count(nullif(r.nice, false))-count(nullif(r.nice, true)) as rep 
  from pref_users u, pref_rep r 
  where u.vip>now()and u.id=r.id group by u.id order by rep asc;


           id            |           first_name           | last_name | rep  
-------------------------+--------------------------------+--------------------
 OK413274501330          | ali                            | salimov   | -193
 OK357353924092          | viktor                         | litovka   | -137
 DE20287                 | sergej warapow                 |              

My question is please the following:

How to find all negatively rated players, who have rated other players?

(The background is that I have added a possibility to rate others - to all VIP-players. Until that only positively rated players could rate others).

I have tried the following, but get the error below:

# select count(*) from pref_rep r, pref_users u 
where r.author = u.id and u.vip > now() and 
u.id in (select id from pref_rep 
where (count(nullif(nice, false)) -count(nullif(nice, true))) < 0);

ERROR:  aggregate functions are not allowed in WHERE
LINE 1: ...now() and u.id in (select id from pref_rep where (count(null...
                                                             ^

UPDATE:

I am trying it with temporary table now -

First I fill it with all negatively rated VIP-users and this works well:

# create temp table my_temp as select u.id, u.first_name, u.last_name,
  count(nullif(r.nice, false))-count(nullif(r.nice, true)) as rep 
  from pref_users u, pref_rep r 
  where u.vip>now() and u.id=r.id group by u.id;

 SELECT 362

But then my SQL JOIN returns too many identical rows and I can not find what condition is missing there:

 # select u.id, u.first_name, u.last_name 
   from pref_rep r, pref_users u, my_temp t 
   where r.author=u.id and u.vip>now() 
   and u.id=t.id and t.rep<0;

           id            |           first_name           |         last_name          
-------------------------+--------------------------------+----------------------------
 OK400153108439          | Vladimir                       | Pelix
 OK123283032465          | Edik                           | Lehtik
 OK123283032465          | Edik                           | Lehtik
 OK123283032465          | Edik                           | Lehtik
 OK123283032465          | Edik                           | Lehtik
 OK123283032465          | Edik                           | Lehtik
 OK123283032465          | Edik                           | Lehtik

Same problem (multiple rows with same data) I get for the statement:

# select u.id, u.first_name, u.last_name 
  from pref_rep r, pref_users u 
  where r.author = u.id and u.vip>now() 
  and u.id in (select id from my_temp where rep < 0);

I wonder what condition could be missing here?

Alexander Farber
  • 21,519
  • 75
  • 241
  • 416
  • 7
    You have to put aggregate functions in the HAVING clause, you can't put them in the WHERE clause. – Rabbit Dec 17 '15 at 17:21
  • Trying `select id from pref_rep having (count(nullif(nice, false)) -count(nullif(nice, true))) < 0;`gives me unfortunately: `ERROR: column "pref_rep.id" must appear in the GROUP BY clause or be used in an aggregate function` – Alexander Farber Dec 17 '15 at 18:42
  • 1
    When using aggregates, everything in the SELECT clause must either have an aggregate function on it or be in the GROUP BY clause. You might want to invest some time in a SQL tutorial. – Rabbit Dec 17 '15 at 19:06

2 Answers2

2

First of all, I would write your first query as this:

select
  u.id, u.first_name, u.last_name,
  sum(case
        when r.nice=true then 1
        when r.nice=false then -1
      end) as rep 
from
  pref_users u inner join pref_rep r on u.id=r.id 
where
  u.vip>now()
group by
  u.id, u.first_name, u.last_name;

(it's the same as yours, but I find it clearer).

To find negatively rated players, you can use the same query as before, just adding HAVING clause:

having
  sum(case
        when r.nice=true then 1
        when r.nice=false then -1
      end)<0

to find negatively rated players who have rated players, one solution is this:

select
  s.id, s.first_name, s.last_name, s.rep
from (
  select
    u.id, u.first_name, u.last_name,
    sum(case
          when r.nice=true then 1
          when r.nice=false then -1
        end) as rep 
  from
    pref_users u inner join pref_rep r on u.id=r.id 
  where
    u.vip>now()
  group by
    u.id, u.first_name, u.last_name
  having
    sum(case
          when r.nice=true then 1
          when r.nice=false then -1
        end)<0
  ) s
where
  exists (select * from pref_rep p where p.author = s.id)

eventually the having clause can be removed from the inner query, and you just can use this where clause on the outer query:

where
  rep<0
  and exists (select * from pref_rep p where p.author = s.id)
Alexander Farber
  • 21,519
  • 75
  • 241
  • 416
fthiella
  • 48,073
  • 15
  • 90
  • 106
2

You forgot to mention that pref_users.id is defined as PRIMARY KEY - else your first query would not work. It also means that id is already indexed.

The best query largely depends on typical data distribution.

Assuming that:

  • ... most users don't get any negative ratings.
  • ... most users don't vote at all.
  • ... some or many of those who vote do it often.

It would pay to identify the few possible candidates and only calculate the total rating for those to arrive at the final selection - instead of calculating the total for every user and then filtering only few.

SELECT *
FROM  (  -- filter candidates in a subquery
   SELECT *
   FROM   pref_users u
   WHERE  u.vip > now()
   AND    EXISTS (
      SELECT 1
      FROM   pref_rep
      WHERE  author = u.id  -- at least one rating given
      )
   AND    EXISTS (
      SELECT 1
      FROM   pref_rep
      WHERE  id = u.id 
      AND    NOT nice  -- at least one neg. rating received
      )
   ) u
JOIN   LATERAL (  -- calculate total only for identified candidates
   SELECT sum(CASE nice WHEN true THEN 1 WHEN false THEN -1 END) AS rep 
   FROM   pref_rep
   WHERE  id = u.id
   ) r ON r.rep < 0;

Indexes

Obviously, you need an index on pref_rep.author besides the (also assumed!) PRIMARY KEY indexes on both id columns.

If your tables are big some more advanced indexes will pay.

For one, you only seem to be interested in current VIP users (u.vip > now()). A plain index on vip would go a long way. Or even a partial multicolumn index that includes the id and truncates older tuples from the index:

CREATE INDEX pref_users_index_name ON pref_users (vip, id)
WHERE vip > '2015-04-21 18:00';

Consider details:

If (and only if) negative votes are a minority, a partial index on pref_rep might also pay:

CREATE INDEX pref_rep_downvote_idx ON pref_rep (id)
WHERE NOT nice;

Test performance with EXPLAIN ANALYZE, repeat a couple of time to rule out caching effects.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228