I am trying (and failing) to join some tables in a SQLite database. The data itself is complicated but I think I have boiled it down to an illustrative example.
Here are the three tables I want to join.
Table: Events
+----+---------+-------+------------+
| id | user_id | class | time_stamp |
+----+---------+-------+------------+
| 1  | 'user1' | 6     | 100        |
| 2  | 'user1' | 12    | 400        |
| 3  | 'user1' | 4     | 900        |
| 4  | 'user2' | 6     | 400        |
| 5  | 'user2' | 3     | 800        |
| 6  | 'user2' | 8     | 900        |
+----+---------+-------+------------+
Table: Games
+---------+---------+------------+------------+
| user_id | game_id | game_class | time_stamp |
+---------+---------+------------+------------+
| 'user1' | 1       | 'A'        | 200        |
| 'user2' | 2       | 'A'        | 300        |
| 'user1' | 3       | 'B'        | 500        |
| 'user1' | 4       | 'A'        | 600        |
| 'user1' | 5       | 'A'        | 700        |
+---------+---------+------------+------------+
Table: AScores
+---------+-------+
| game_id | score |
+---------+-------+
| 1       | 8     |
| 2       | 2     |
| 4       | 9     |
| 5       | 6     |
+---------+-------+
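For anyone who wants to reproduce this, here is a minimal sketch that loads the sample data into an in-memory SQLite database using Python's `sqlite3` module. The column types and the helper name `build_example_db` are my own assumptions for illustration; the time column is called `time_stamp`, matching the queries further down.

```python
import sqlite3


def build_example_db():
    """Return an in-memory SQLite connection holding the three example tables."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the question only shows names and values.
    conn.executescript("""
        CREATE TABLE Events  (id INTEGER PRIMARY KEY, user_id TEXT,
                              class INTEGER, time_stamp INTEGER);
        CREATE TABLE Games   (user_id TEXT, game_id INTEGER PRIMARY KEY,
                              game_class TEXT, time_stamp INTEGER);
        CREATE TABLE AScores (game_id INTEGER PRIMARY KEY, score INTEGER);
    """)
    conn.executemany("INSERT INTO Events VALUES (?, ?, ?, ?)", [
        (1, "user1", 6, 100), (2, "user1", 12, 400), (3, "user1", 4, 900),
        (4, "user2", 6, 400), (5, "user2", 3, 800), (6, "user2", 8, 900),
    ])
    conn.executemany("INSERT INTO Games VALUES (?, ?, ?, ?)", [
        ("user1", 1, "A", 200), ("user2", 2, "A", 300), ("user1", 3, "B", 500),
        ("user1", 4, "A", 600), ("user1", 5, "A", 700),
    ])
    conn.executemany("INSERT INTO AScores VALUES (?, ?)", [
        (1, 8), (2, 2), (4, 9), (5, 6),
    ])
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_example_db()
    # Print the row count of each table as a quick sanity check.
    for table in ("Events", "Games", "AScores"):
        count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        print(table, count)
```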
I would like to join these to add a column to the first table containing the user's current score in game class A at the time of the event. I.e. I would like the result of the join to look like this:
Desired Result
+----+---------+-------+------------+-----------------+
| id | user_id | class | time_stamp | current_a_score |
+----+---------+-------+------------+-----------------+
| 1  | 'user1' | 6     | 100        | (null)          |
| 2  | 'user1' | 12    | 400        | 8               |
| 3  | 'user1' | 4     | 900        | 6               |
| 4  | 'user2' | 6     | 400        | 2               |
| 5  | 'user2' | 3     | 800        | 2               |
| 6  | 'user2' | 8     | 900        | 2               |
+----+---------+-------+------------+-----------------+
The following simple join pulls together the two tables AScores and Games.
SELECT * FROM AScores
INNER JOIN Games
ON AScores.game_id = Games.game_id
And so I was hoping to join this to the Events table as a sub-query. Something like this:
SELECT Events.*, AScoredGames.time_stamp AS game_time_stamp, AScoredGames.score
FROM Events
LEFT OUTER JOIN (
SELECT AScores.score, Games.* FROM AScores
INNER JOIN Games
ON AScores.game_id = Games.game_id
) AS AScoredGames
ON Events.user_id = AScoredGames.user_id
AND Events.time_stamp >= AScoredGames.time_stamp
ORDER BY Events.time_stamp ASC
That results in the following:
+----+---------+-------+------------+-----------------+-------+
| id | user_id | class | time_stamp | game_time_stamp | score |
+----+---------+-------+------------+-----------------+-------+
| 1  | user1   | 6     | 100        | NULL            | NULL  |
| 2  | user1   | 12    | 400        | 200             | 8     |
| 4  | user2   | 6     | 400        | 300             | 2     |
| 5  | user2   | 3     | 800        | 300             | 2     |
| 6  | user2   | 8     | 900        | 300             | 2     |
| 3  | user1   | 4     | 900        | 200             | 8     |
| 3  | user1   | 4     | 900        | 600             | 9     |
| 3  | user1   | 4     | 900        | 700             | 6     |
+----+---------+-------+------------+-----------------+-------+
So I need to group by Events.id to collapse the three rows for Events.id 3 into one. What I want is to pick the row with the maximum game_time_stamp and then use that same row's score. If I use MAX(game_time_stamp) as my aggregation, I still have to aggregate the score independently. Is there a way to tie the row chosen by the score column's aggregation function to the result of the game_time_stamp column's aggregation function?
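To make the problem concrete, here is a rough sketch of what happens when the two columns are aggregated independently. It uses Python's `sqlite3` module with an in-memory copy of the sample data; the column types and the helper name `independent_aggregates` are assumptions for illustration only.

```python
import sqlite3


def independent_aggregates():
    """Group the join by Events.id, aggregating each column on its own."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; data is copied from the example tables.
    conn.executescript("""
        CREATE TABLE Events  (id INTEGER PRIMARY KEY, user_id TEXT,
                              class INTEGER, time_stamp INTEGER);
        CREATE TABLE Games   (user_id TEXT, game_id INTEGER PRIMARY KEY,
                              game_class TEXT, time_stamp INTEGER);
        CREATE TABLE AScores (game_id INTEGER PRIMARY KEY, score INTEGER);
    """)
    conn.executemany("INSERT INTO Events VALUES (?, ?, ?, ?)", [
        (1, "user1", 6, 100), (2, "user1", 12, 400), (3, "user1", 4, 900),
        (4, "user2", 6, 400), (5, "user2", 3, 800), (6, "user2", 8, 900),
    ])
    conn.executemany("INSERT INTO Games VALUES (?, ?, ?, ?)", [
        ("user1", 1, "A", 200), ("user2", 2, "A", 300), ("user1", 3, "B", 500),
        ("user1", 4, "A", 600), ("user1", 5, "A", 700),
    ])
    conn.executemany("INSERT INTO AScores VALUES (?, ?)", [
        (1, 8), (2, 2), (4, 9), (5, 6),
    ])
    # Two separate MAX() calls: nothing links the chosen score to the
    # row that supplied the latest game time stamp.
    query = """
        SELECT Events.id,
               MAX(AScoredGames.time_stamp) AS game_time_stamp,
               MAX(AScoredGames.score)      AS score
        FROM Events
        LEFT OUTER JOIN (
            SELECT AScores.score, Games.* FROM AScores
            INNER JOIN Games
            ON AScores.game_id = Games.game_id
        ) AS AScoredGames
        ON Events.user_id = AScoredGames.user_id
        AND Events.time_stamp >= AScoredGames.time_stamp
        GROUP BY Events.id
        ORDER BY Events.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in independent_aggregates():
        print(row)
    # Event 3 comes back as (3, 700, 9): the time stamp is from game 5,
    # but the score 9 belongs to game 4, not the desired 6.
```

The time stamp and the score for event 3 come from different games, which is exactly the mismatch I am trying to avoid.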
(N.B. Existing answers to questions like "Select first record in a One-to-Many relation using left join" and "SQL Server: How to Join to first row" seem to suggest that I cannot, and say one must use a WHERE clause over a sub-query. I am struggling with that approach (I'll post another question about it). I can think of at least one solution, but I am hoping there are better ones.)