-1

The following MySQL query is suppose to rank the posts according to their views + rating + submit date in an ascending order:

select
    cat                    ,
    p.id                   ,
    title                  ,
    p.date                 ,
    shares                 ,
    source                 ,
    cat                    ,
    count(v.post_id) views ,
    sum(r.ilike)     rating,
    r.module               ,
    r.module_id            ,
    @Rank := @Rank + 1 AS Rank
from
    posts p
    JOIN
        rates r
        on
            r.module_id  = p.id
            AND r.module = 'posts'
    JOIN
        posts_views v
        on
            v.post_id = p.id
WHERE
    p.date     <= UNIX_TIMESTAMP(NOW())
    AND p.state = '3'
    AND
    (
        p.cat NOT REGEXP '[[:<:]]15[[:>:]]'
    )
GROUP BY
    r.module_id
ORDER BY
    rating DESC       ,
    views DESC        ,
    p.date ASC LIMIT 0,     10

Gives the following result: query result screen shot

We have 3 problems in the result:

  1. the views column values are doubled
  2. the rating column values are copying the views' value
  3. The Rank column in NULL
JNevill
  • 46,980
  • 4
  • 38
  • 63
Xenotic
  • 51
  • 1
  • 9

1 Answers1

0

The query is generating a semi-Cartesian product. With multiple matching rows from r and multiple matching rows from v, those rows are getting matched together, inflating the results for rating and views. If we remove the GROUP BY and the aggregate functions, and get detail rows back, we can observe the "duplicate" rows that are causing the views count to be doubled, tripled...

One fix for this is to avoid the Cartesian product, by pre-aggregaing from at least one of the child tables in an inline view. Then we join the derived table to the posts table to get the aggregate to the outer query.

We probably want to consider using an outer join to handle the condition when there are no matching rows in views or rates, so we can return zero count for posts that don't have any views.

Initialize user defined variables, either as a separate statement, or within an inline view.

Also, we want to qualify all column references, both as an aid to the future reader (not force them to look at table definitions to figure out which table a column like cat or title or source is coming from), and to avoid the query breaking with an "ambiguous column" error, in the future when a column of the same name is added to one of the tables referenced in the query.

I suggest something like this:

SELECT p.cat
     , p.id
     , p.title
     , p.date
     , p.shares
     , p.source
     , p.cat
     , IFNULL(v.cnt_views,0)  AS views
     , r.tot_rating           AS rating
     , r.module
     , r.module_id
     , @Rank := @Rank + 1     AS Rank

  FROM ( SELECT @Rank := 0 ) i

 CROSS
  JOIN posts p

  LEFT
  JOIN ( SELECT ra.module_id
              , MAX(ra.module)   AS module
              , SUM(ra.ilike)    AS tot_rating
           FROM rates ra
          WHERE ra.module = 'posts'
          GROUP
             BY ra.module_id
       ) r
    ON r.module_id = p.id

  LEFT
  JOIN ( SELECT pv.post_id
              , SUM(1)    AS cnt_views
           FROM posts_views pv
          GROUP
             BY pv.post_id
       ) v
    ON v.post_id = p.id

 WHERE p.date <= UNIX_TIMESTAMP(NOW())
   AND p.state = '3'
   AND p.cat NOT REGEXP '[[:<:]]15[[:>:]]'

 ORDER
    BY r.tot_rating DESC
     , v.cnt_views  DESC
     , p.date ASC
 LIMIT 0, 10
spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '@Rank := 0) i CROSS JOIN posts p LEFT JOIN ( SELECT ra.module_id, MAX(ra.module' at line 1 – Xenotic Mar 04 '19 at 18:55
  • The `SELECT` keyword is missing in the inline view `i`. corrected. – spencer7593 Mar 04 '19 at 19:54