1

I have parent posts table and child votes with posts.id and votes.post_id relation. I want to count average rating for each post but for only latest given 50 votes. I know how to do that for all votes:

SELECT T1.`title`, (
    SELECT AVG(`vote`)
    FROM `votes`
    WHERE `votes`.`post_id` = T1.`id`
) AS `average`
FROM `posts` T1
GROUP BY T1.`id`

I know this possible to do with subquery:

SELECT T1.`title`, (
    SELECT AVG(`vote`)
    FROM (
        SELECT `vote` FROM `votes`
        WHERE `votes`.`post_id` = T1.`id`
        ORDER BY `votes`.`id` DESC
        LIMIT 10
    ) AS T2
) AS `average`
FROM `posts` T1
GROUP BY T1.`id`

But there is error: Error in query (1054): Unknown column 'T1.id' in 'where clause'. T1 alias is not accessible in subsubquery. Any ideas?

http://sqlfiddle.com/#!9/fb9341/2

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Artem P
  • 5,198
  • 5
  • 40
  • 44

1 Answers1

1

Okay, you want to pick the rows that are in the most recent 50 rows per post. There are numerous answers on Stack Overflow for this type of query, mostly under the or tags. Example: How to SELECT the newest four items per category?

Once you write that query, you could put inside a subquery like you already know how to write, to get the AVG(vote) per post.


Re your comments:

This is what I mean:

SELECT T1.title, AVG(V.vote) AS avg_vote 
FROM posts T1 
JOIN (
   SELECT v1.id, v1.post_id, v1.vote
   FROM votes v1
   LEFT OUTER JOIN votes v2 ON v1.post_id = v2.post_id and v1.id < v2.id
   GROUP BY v1.id
   HAVING COUNT(*) < 10
) AS V ON T1.id = V.post_id
GROUP BY T1.id;

Output given the data in your SQLFiddle:

+---------+----------+
| title   | avg_vote |
+---------+----------+
| Title 1 |   5.4000 |
| Title 2 |   4.2000 |
+---------+----------+

To help the JOIN in the subquery, you should have an index on votes over the columns (post_id, id).


Here's another solution that works without requiring a unique column:

SELECT T1.title, AVG(V.vote) AS avg_vote
FROM posts T1
JOIN (
    SELECT *
    FROM (
        SELECT v.*, @r := IF(@p = post_id, @r+1, 1) AS rownum, @p := post_id
        FROM (SELECT @p:=null, @r:=0) AS _init
        CROSS JOIN votes v
        ORDER BY v.post_id, v.id DESC
    ) AS t
    WHERE t.rownum <= 10
) AS V ON T1.id = V.post_id
GROUP BY T1.id;

The output is the same as the prior query.

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Yeah, but I still need to pass alias to subsubquery, because `AVG` creates additional subquery. – Artem P Sep 20 '16 at 02:24
  • No you don't. Do the top-50 query per `votes.post_id` without reference to the outer query. Do it as a subquery in the `FROM` clause. Then join that to the `posts` table. – Bill Karwin Sep 20 '16 at 03:11
  • I did this and it's extremely slow 4.5s. – Artem P Sep 20 '16 at 17:13
  • The problem is: if I want to change this part `v1.id < v2.id` `id` to other column which has similar values and more than 10 rows contain same value, then all rows with this same value will be included in result. So this only working if column has unique values. – Artem P Sep 20 '16 at 17:53
  • Second query actually works. But one note: part `ORDER BY v.post_id` is required, so other order conditions must follow it. – Artem P Sep 22 '16 at 22:34