0

I am trying to achieve the comment (title) written on the most recent day (review_date) from each user (username) in the database

My code is :

select tb1.*, tb2.* from 
(select username, via_mobile as pc, max(review_date) as pcdate from tripadvisor where username != "" and via_mobile='false' group by username) tb1
join 
(select username, via_mobile as mobile, max(review_date) as mobile from tripadvisor whereusername != "" and via_mobile='true' group by username) tb2
on tb1.username = tb2.username;

The problem is I cannot get the right review for the right date. For example :

username; review_date; title 
Alan; 2012-12-18 ;"If..."

But it should display Alan; 2012-12-18; "Nice hotel"

Can you help me to fix the code.

spencer7593
  • 106,611
  • 15
  • 112
  • 140

3 Answers3

1

Your question is a little unclear, but if I understand correctly, you're looking for each full row with the highest date, selected distinctly/grouped by username? This should give you that:

SELECT
  username,
  review_date,
  title
FROM
  tripadvisor AS ta
  INNER JOIN (
    SELECT
      username,
      max(review_date) AS review_date
    FROM
      tripadvisor
    GROUP BY
      username
  ) AS max_table
    ON ta.username = max_table.username
      AND ta.review_date = max_table.review_date;
WHERE
  username != ''
  -- any extra where clauses go here

See: How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?

Community
  • 1
  • 1
Liam Gray
  • 1,089
  • 9
  • 16
  • 1
    If there are two or more rows with the same most recent `review_date` for a given `username`, this query will return two or more rows for that username. This is probably an edge case. We probably expect `(username,review_date)` to be unique. But we don't see any guarantee of this. Also, the original query seems to limit the rows to those that have via_mobile='true' or via_mobile='false', excluding e.g. rows with via_mobile='unknown'. But those two minor issues aside, this query demonstrates the normative approach to returning the "latest row". +10 – spencer7593 Feb 22 '17 at 23:30
  • 1
    True, although if I'm honest I was very unclear about the pc/mobile differentiation and it seemed like a case of bad normalisation; unless OP is attempting to get the most recent date for mobile, and the most recent date for pc on the same row... I just wrote a query tailored to OP's result table, but your answer is more tailored to the way OP's written their query. +1 – Liam Gray Feb 22 '17 at 23:43
  • 1
    Yes, the specification is imprecise; there are several ways to interpret it, as to what row(s) should be returned under what circumstances. The intent of my comment was to point out some specific edge cases. I wasn't intending to be critical of the query in this answer. This query is a good demonstration, and is a lot cleaner than the query in my answer. I'd +10 again if SO would let me. – spencer7593 Feb 22 '17 at 23:49
1

If the goal is to return the most recent review title for "mobile" and for "pc", I'd do something like this:

SELECT q.username
     , MAX(q.pc_date)     AS pc_date
     , MAX(p.title)       AS pc_title
     , MAX(q.mobile_date) AS mobile_date
     , MAX(r.title)       AS mobile_title
  FROM ( SELECT t.username
              , MAX(IF(t.via_mobile='false',t.review_date,NULL) AS pc_date
              , MAX(IF(t.via_mobile='true',t.review_date,NULL) AS mobile_date
           FROM tripadvisor t
          WHERE t.username <> ''
            AND t.via_mobile IN ('true','false')
          GROUP
             BY t.username
       ) q
  LEFT
  JOIN tripadvisor p
    ON p.username    = q.username
   AND p.review_date = q.pc_date
   AND p.via_mobile  = 'false'
  LEFT
  JOIN tripadvisor r
    ON r.username    = q.username
   AND r.review_date = q.mobile_date
   AND r.via_mobile  = 'true'
 GROUP
    BY q.username

If the user has only "mobile" reviews and no "pc" reviews, this query will return a row, but with NULL values for the "pc" columns. Similarly, the query will return NULL values for the "mobile" columns for a user that has only "pc" reviews.

The query could easily be changed to only returns rows for users that have both "mobile" and "pc" reviews, to be closer to the original using the INNER JOIN.


If the goal is simpler, just to return just the most recent review...

SELECT r.username
     , r.review_date
     , MAX(r.title)      AS title
     , MAX(r.via_mobile) AS via_mobile
  FROM ( SELECT t.username
              , MAX(t.review_date) AS max_review_date
           FROM tripadvisor t
          WHERE t.username <> ''
            AND t.via_mobile IN ('true','false')
          GROUP
             BY t.username
       ) q
  JOIN tripadvisor r
    ON r.username    = q.username
   AND r.review_date = q.max_review_date
   AND r.via_mobile IN ('true','false')
 GROUP
    BY r.username
     , r.review_date

The results of this query are somewhat indeterminate when a username has multiple rows with identical (most recent) review_date. This guarantees a single row will be returned, but the title and via_mobile may not be from the same row.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
0

Your question could be expressed as an existence test: show the rows for which the review date matches the latest review date for that user.

Existence is tested with EXISTS in a correlated subquery:

SELECT * FROM tripadvisor as T
where exists ( 
    select 1 from tripadvisor
    where username = T.username
    group by username
    having MAX(review_date) = T.review_date
);
James K. Lowden
  • 7,574
  • 1
  • 16
  • 31