0

I have the following query, which correctly returns the user ID and two other pieces of information for a certain user just before a specific timestamp ('ta.Timestamp'). There are multiple rows for the same user before and after that, and I only want that specific one.

This works perfectly for a single user, but I can't seem to find a quick and efficient way to obtain the same results for groups of users (let's say all user IDs between X and Y). I am currently running it in a loop on Python but it is way too slow for 70k+ queries.

SELECT
  rm.userId,
  rm.info1,
  rm.info2
FROM request_metadata rm
LEFT JOIN
 (SELECT userId, Timestamp
  FROM table_approval
  WHERE userId = {}
  LIMIT 1) ta
ON rm.userId = ta.userId
WHERE rm.userId = {}
AND ta.Timestamp > rm.Timestamp
ORDER BY rm.itemId DESC
LIMIT 1;

I have tried different approaches (such as removing LIMIT and using GROUP BY), but I was not successful.

Any help is appreciated!

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Michele Tonutti
  • 4,298
  • 1
  • 21
  • 22
  • It's not clear from your query which table has multiple rows per user, because both tables have a userId column and a Timestamp column. It would be helpful if you create a SQLFiddle with a minimal reproducible example. – Bill Karwin Nov 28 '17 at 01:56
  • 1
    Anyway, this is a variation on the [greatest-n-per-group] question that comes up frequently on Stack Overflow. For this case, I'd use an extra outer join to do an exclusion join, looking for a counter-case (a row with a timestamp between the timestamp of `rm` and `ta`) and if the outer join matches no such row, then you've already found the closest match. See my answer to https://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group/1313293#1313293 – Bill Karwin Nov 28 '17 at 01:58

1 Answers1

0

I solved this like this, thanks to Bill's answer:

SELECT t1.*
FROM
(SELECT
rm.userItemId,
rm.info1,
rm.timestamp
FROM rm_table rm
LEFT JOIN ua_table ua
ON rm.userItemId = ua.userItemId
WHERE rm.timestamp < ua.timestamp
) t1
LEFT JOIN
(SELECT
rm.userItemId,
rm.info1,
rm.timestamp
FROM rm_table rm
LEFT JOIN ua_table ua
ON rm.userItemId = ua.userItemId
WHERE rm.timestamp < ua.timestamp
) t2
ON (t1.userItemId = t2.userItemId AND t1.timestamp < t2.timestamp)
WHERE t2.userItemId IS NULL
Michele Tonutti
  • 4,298
  • 1
  • 21
  • 22