1

In my DB, there are two types of images: challenges, and answers. They both have lat/lng location columns. In this query, I wish to select the Challenge or Answer that the user created most recently via an INNER JOIN.

(The business logic behind this: we basically want to obtain a list of users which includes the user's last known location, which is determined by their most recent Challenge or Answer - whichever is most recent. If a user does not have a last known location, they should not be included in this list.)

I am getting [Err] 1054 - Unknown column 'U.Id' in 'where clause':

SELECT
    U.Id,
    U.TotalPoints,
    LastImage.Lat,
    LastImage.Lng
FROM User U
INNER JOIN
(
    SELECT Lat, Lng FROM
    (
        (SELECT Lat, Lng, CreatedOn FROM AnswerImage WHERE UserId = U.Id ORDER BY Id DESC LIMIT 1)
        UNION ALL
        (SELECT Lat, Lng, CreatedOn FROM ChallengeImage WHERE UserId = U.Id ORDER BY Id DESC LIMIT 1)
    ) LastImages ORDER BY CreatedOn DESC LIMIT 1
) LastImage
WHERE U.Type = 1 AND U.Status = 2
ORDER BY TotalPoints DESC;

I cannot seem to reference the User table (alias U) from within my derived 'LastImages' table (or whatever the proper term for it is).

Can anyone help? I've tried other methods, but none meet all my requirements:

  1. A row is only returned if the user has at least 1 challenge or 1 answer (hence the UNION)
  2. The most recent (as determined by ChallengeImage.CreatedOn and AnswerImage.CreatedOn) image is to be used in the join

Thanks!

Matthew Housser
  • 1,012
  • 9
  • 21

2 Answers2

2
SELECT u.id,
       u.TotalPoints,
       IF(IFNULL(a.CreatedOn, '1900-01-01') > IFNULL(c.CreatedOn, '1900-01-01'), a.Lat, c.Lat) AS Lat,
       IF(IFNULL(a.CreatedOn, '1900-01-01') > IFNULL(c.CreatedOn, '1900-01-01'), a.Lng, c.Lng) AS Lng
FROM User AS u
LEFT JOIN
    (SELECT UserId, Lat, Lng, CreatedOn
     FROM AnswerImage AS a
     JOIN (SELECT UserId, MAX(CreatedOn) AS CreatedOn
           FROM AnswerImage
           GROUP BY UserId) AS amax
     USING (UserId, CreatedOn)) AS a
    ON u.id = a.UserId
LEFT JOIN
    (SELECT UserId, Lat, Lng, CreatedOn
     FROM ChallengeImage AS c
     JOIN (SELECT UserId, MAX(CreatedOn) AS CreatedOn
           FROM ChallengeImage
           GROUP BY UserId) AS cmax
     USING (UserId, CreatedOn)) AS c
    ON u.id = c.UserId
WHERE U.Type = 1 AND U.Status = 2
    AND (a.Lat IS NOT NULL OR c.Lat IS NOT NULL)
ORDER BY TotalPoints DESC;

The subqueries are one of the common ways to get the last row per group in a table, see

Retrieving the last record in each group

Then you left join them with the User table so you'll get results even if a user doesn't have matches in both tables.

Community
  • 1
  • 1
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • I'll give it a shot, just trying to fix it up right now. Many typos such as table 'b' not existing, 'ChallengeImageImage', 'ChallenImage', etc. =P – Matthew Housser Jun 03 '14 at 16:48
  • This appears to give me back records for Users that have no Challenges and no Answers. – Matthew Housser Jun 03 '14 at 16:54
  • You can filter those out with `HAVING Lat IS NOT NULL` at the end. – Barmar Jun 03 '14 at 17:08
  • How does that compare with adding to the WHERE clause: `WHERE U.Type = 1 AND U.Status = 2 AND (a.Lat IS NOT NULL OR c.Lat IS NOT NULL)` – Matthew Housser Jun 03 '14 at 18:01
  • How would HAVING clauses meet my requirement #1? We have two LEFT JOINs, so wouldn't this still result in Users being returned that have neither a latest Challenge nor a latest Answer? – Matthew Housser Jun 03 '14 at 18:17
  • Doing it in the `WHERE` clause is good, too. Probably a little more efficient, since it won't waste time on rows with no matches in either table. – Barmar Jun 03 '14 at 19:42
  • I'll try this when I'm back home tonight, thanks. :) I'll accept your answer if everything goes OK. Should I be concerned about the performance of the inner `SELECT UserId, MAX(CreatedOn) AS CreatedOn FROM ChallengeImage GROUP BY UserId` queries? They select the entire table, which presumably could be millions of records long... – Matthew Housser Jun 03 '14 at 21:32
  • If you have an index on `UserId` it should be reasonably fast, if you have a composite index on `UserId, CreatedOn` it should be very fast. – Barmar Jun 04 '14 at 04:22
  • User.Id is a Primary Key, and ChallengeImage.UserId/AnswerImage.UserId are foreign keys and thus indices of their own. I'll consider indexing CreatedOn. On that note - everything seems to work. I fully understand your solution and the novel solution to 'grouping on max value'. Accepted! Thanks! – Matthew Housser Jun 04 '14 at 04:29
0

Your userid = u.id is at the wrong place indeed. Since you are pulling one record, what is reason for the order by in your last line? Please give this a try. untested :(

select u.id, u.totalpoints, a.lat, a.lng
from user u
join (
    select userid, lat, lng, createdon
    from AnswerImage
    union all
    select userid, lat, lng, createdon
    from ChallengeImage) a on a.userid = u.id
where u.type = 1 and u.status = 2
order by a.createdon desc
limit 1;
Fabricator
  • 12,722
  • 2
  • 27
  • 40
  • I'm not pulling only one record, I'm pulling a list of all users that have a last known location - 'last known location' being defined as the lat/lng info of their most recent Challenge or Answer, whichever is most recent. – Matthew Housser Jun 03 '14 at 16:46
  • Ah I see, Barmar's result's probably what you want then. – Fabricator Jun 03 '14 at 16:50