It doesn't work as you think it should and the documentation explains the meaning of DISTINCT
: it's about distinct rows:
The ALL
and DISTINCT
options specify whether duplicate rows should be returned. ALL
(the default) specifies that all matching rows should be returned, including duplicates. DISTINCT
specifies removal of duplicate rows from the result set. It is an error to specify both options. DISTINCTROW
is a synonym for DISTINCT
.
(source: http://dev.mysql.com/doc/refman/5.7/en/select.html)
You need to group the rows by user in order to get a single row for each user but, unfortunately, you cannot get their most recent score this way.
You can get the maximum, minimum, average score and other computed values. Check the list of GROUP BY
aggregate functions.
The query
This is the query that gets the values you need:
SELECT u.fsname, u.emailaddress, la.score
FROM users u
INNER JOIN attempts la # 'la' from 'last attempt'
ON u.emailaddress = la.emailaddress
LEFT JOIN attempts mr # 'mr' from 'more recent' (than last attempt)
ON la.emailaddress = mr.emailaddress AND la.datetime < mr.datetime
WHERE mr.datetime IS NULL
How it works
It joins table users
(aliased as u
) with table attempts
(aliased as la
, short for "last attempt") using emailaddress
as the matching column. It's the join you already have in your query, I added the aliases because they help you write less from that point on.
Next, it joins the attempts
table again (aliased as mr
from "more recent than the last attempt"). It matches each attempt from la
with all the attempts from mr
of the same user (identified by their emailaddress
) and that have a more recent datetime
. The LEFT JOIN
ensures that each row from la
matches at least one row from mr
. The rows from la
that do not have a match in mr
are the rows that have the biggest values of datetime
for each emailaddress
. They are matched with rows full of NULL
(for the mr
part).
Finally, the WHERE
clause keeps only the rows that have NULL
in the datetime
column of the row selected from mr
. These are the rows that matched the most recent entries from la
for each value of emailaddress
.
Performance remarks
In order to run fast this query (any query!) needs indexes on the columns used in the JOIN
, WHERE
, GROUP BY
and ORDER BY
clauses.
You should not use emailaddress
in table attempts
to identify the user. You should have a PK
(primary key) on table users
and use that as a FK
(foreign key) in table attempts
(and other tables that refer to a user). If emailaddress
is the PK
of table users
change it to an UNIQUE INDEX
and use a new INTEGER AUTO INCREMENT
ed column userId
as PK
instead. The indexes on numeric columns are faster and use less space than the indexes on string columns.