Running multiple queries in MySQL without using sub-query

Question

I have two tables. The first one, users, has these columns:

fsname
emailaddress

The second one, attempts, has emailaddress, score and datetime.

What I want to do is order the attempts table by datetime, join it with the users table on matching emailaddress, and then pick the final attempt of each unique user.

In short, I have to pick the last attempt of each user by joining these tables. This is the query I wrote to achieve this:

$query = "SELECT 
            distinct users.fsname, users.emailaddress, attempts.score 
            FROM users 
            INNER JOIN attempts 
            ON users.emailaddress = attempts.emailaddress 
            ORDER BY datetime DESC";

This query is supposed to order the attempts by datetime first and then pick only the rows with a distinct first name, which is fsname here.

But when I execute the above query, it returns non-unique values of fsname even though I am using DISTINCT with fsname.

Can anyone please tell me why DISTINCT is not picking only distinct fsname values?

I have tried both DISTINCT fsname and DISTINCT(fsname), but neither works.

That query is showing distinct `fsname`, `emailaddress`, `score`. If `fsname` has more than one `emailaddress` or `score`, you will have more than 1 row for `fsname`. — CrApHeR, May 21 '15 at 17:38
`DISTINCT` applies to the entire `SELECT` list. If you want to make just a single column distinct, use `GROUP BY` — Barmar, May 21 '15 at 17:39
Your distinct condition is about all the selected fields (users.fsname, users.emailaddress, attempts.score) and not about the first one. You should group by users.fsname, but it will not use the latest attempts.score — sdespont, May 21 '15 at 17:40

score 3 · Accepted Answer · answered May 21 '15 at 18:13

DISTINCT doesn't work the way you think it does. The documentation explains that it is about distinct rows, not distinct values of a single column:

The ALL and DISTINCT options specify whether duplicate rows should be returned. ALL (the default) specifies that all matching rows should be returned, including duplicates. DISTINCT specifies removal of duplicate rows from the result set. It is an error to specify both options. DISTINCTROW is a synonym for DISTINCT.

(source: http://dev.mysql.com/doc/refman/5.7/en/select.html)
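To see the row-level behaviour for yourself, here is a small sketch. It is an assumption-laden demo, not your real setup: it uses Python's built-in sqlite3 module as a stand-in for MySQL (DISTINCT behaves the same way for this query), and the user 'Ann' and her attempts are made-up sample data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (fsname TEXT, emailaddress TEXT);
    CREATE TABLE attempts (emailaddress TEXT, score INTEGER, datetime TEXT);
    INSERT INTO users VALUES ('Ann', 'ann@example.com');
    INSERT INTO attempts VALUES
        ('ann@example.com', 40, '2015-05-01 10:00:00'),
        ('ann@example.com', 40, '2015-05-02 10:00:00'),
        ('ann@example.com', 75, '2015-05-03 10:00:00');
""")

# Same SELECT list as in the question, without the ORDER BY.
distinct_rows = conn.execute("""
    SELECT DISTINCT users.fsname, users.emailaddress, attempts.score
    FROM users
    INNER JOIN attempts ON users.emailaddress = attempts.emailaddress
""").fetchall()

# The two score-40 rows are identical in every selected column, so they
# collapse into one. The score-75 row differs in one column, so it stays
# and 'Ann' is listed twice.
for row in distinct_rows:
    print(row)
```

Three attempts produce two result rows: DISTINCT removed only the row that was a duplicate in all three selected columns.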

You need to group the rows by user in order to get a single row for each user but, unfortunately, you cannot get their most recent score this way. You can get the maximum, minimum, average score and other computed values. Check the list of GROUP BY aggregate functions.
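A minimal sketch of that limitation, again using Python's sqlite3 module in place of MySQL and invented sample data (Ann and Bob are hypothetical users). MAX, MIN and COUNT are standard aggregates available in both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (fsname TEXT, emailaddress TEXT);
    CREATE TABLE attempts (emailaddress TEXT, score INTEGER, datetime TEXT);
    INSERT INTO users VALUES ('Ann', 'ann@example.com'), ('Bob', 'bob@example.com');
    -- Ann's latest attempt scored 55, Bob's latest scored 60.
    INSERT INTO attempts VALUES
        ('ann@example.com', 90, '2015-05-01 10:00:00'),
        ('ann@example.com', 55, '2015-05-02 10:00:00'),
        ('bob@example.com', 30, '2015-05-01 10:00:00'),
        ('bob@example.com', 60, '2015-05-02 10:00:00');
""")

# One row per user, but only computed values are available for the score.
grouped = conn.execute("""
    SELECT u.fsname, u.emailaddress,
           MAX(a.score) AS best, MIN(a.score) AS worst, COUNT(*) AS tries
    FROM users u
    INNER JOIN attempts a ON u.emailaddress = a.emailaddress
    GROUP BY u.fsname, u.emailaddress
    ORDER BY u.emailaddress
""").fetchall()

for row in grouped:
    print(row)
```

For Ann the best score (90) is not the score of her most recent attempt (55), which is why grouping alone does not answer the question.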

The query

This is the query that gets the values you need:

SELECT u.fsname, u.emailaddress, la.score 
FROM users u
INNER JOIN attempts la                # 'la' from 'last attempt'
    ON u.emailaddress = la.emailaddress
LEFT JOIN attempts mr                 # 'mr' from 'more recent' (than last attempt)
    ON la.emailaddress = mr.emailaddress AND la.datetime < mr.datetime
WHERE mr.datetime IS NULL

How it works

It joins table users (aliased as u) with table attempts (aliased as la, short for "last attempt") using emailaddress as the matching column. It's the join you already have in your query. I added the aliases because they let you write less from that point on.

Next, it joins the attempts table again (aliased as mr, from "more recent than the last attempt"). It matches each attempt from la with all the attempts from mr that belong to the same user (identified by emailaddress) and have a more recent datetime. The LEFT JOIN ensures that each row from la appears in the result at least once, even when it has no match in mr. The rows from la that have no match in mr are the rows with the biggest value of datetime for each emailaddress. They are paired with a row full of NULLs (for the mr part).

Finally, the WHERE clause keeps only the rows that have NULL in the datetime column of the row selected from mr. These are the rows in which la holds the most recent attempt for each value of emailaddress.
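The steps above can be watched in a small, runnable sketch. It rests on a few assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, the users and attempts are invented sample data, and an ORDER BY is added only to make the output deterministic. Datetimes are stored as ISO-8601 text, which compares correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (fsname TEXT, emailaddress TEXT);
    CREATE TABLE attempts (emailaddress TEXT, score INTEGER, datetime TEXT);
    -- 'Cy' has no attempts and is dropped by the INNER JOIN.
    INSERT INTO users VALUES
        ('Ann', 'ann@example.com'), ('Bob', 'bob@example.com'), ('Cy', 'cy@example.com');
    INSERT INTO attempts VALUES
        ('ann@example.com', 40, '2015-05-01 10:00:00'),
        ('ann@example.com', 60, '2015-05-02 10:00:00'),
        ('ann@example.com', 75, '2015-05-03 10:00:00'),
        ('bob@example.com', 30, '2015-05-01 10:00:00'),
        ('bob@example.com', 20, '2015-05-04 10:00:00');
""")

# Step 1: the joins without the WHERE filter. Each 'la' attempt is paired
# with every more recent 'mr' attempt of the same user, or with NULL
# when no more recent attempt exists.
joined_rows = conn.execute("""
    SELECT u.emailaddress, la.datetime, la.score, mr.datetime
    FROM users u
    INNER JOIN attempts la ON u.emailaddress = la.emailaddress
    LEFT JOIN attempts mr
        ON la.emailaddress = mr.emailaddress AND la.datetime < mr.datetime
    ORDER BY u.emailaddress, la.datetime, mr.datetime
""").fetchall()
for row in joined_rows:
    print(row)

# Step 2: keep only the rows whose 'mr' side is NULL, i.e. the last attempts.
last_attempts = conn.execute("""
    SELECT u.fsname, u.emailaddress, la.score
    FROM users u
    INNER JOIN attempts la ON u.emailaddress = la.emailaddress
    LEFT JOIN attempts mr
        ON la.emailaddress = mr.emailaddress AND la.datetime < mr.datetime
    WHERE mr.datetime IS NULL
    ORDER BY u.emailaddress
""").fetchall()
print(last_attempts)
```

One caveat worth knowing: if a user has two attempts with exactly the same latest datetime, neither is "more recent" than the other, so both rows survive the filter and that user appears twice.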

Performance remarks

To run fast, this query (any query!) needs indexes on the columns used in the JOIN, WHERE, GROUP BY and ORDER BY clauses.

You should not use emailaddress in table attempts to identify the user. You should have a PK (primary key) on table users and use it as a FK (foreign key) in table attempts (and in other tables that refer to a user). If emailaddress is the PK of table users, change it to a UNIQUE INDEX and use a new INTEGER AUTO_INCREMENT column userId as the PK instead. Indexes on numeric columns are generally faster and use less space than indexes on string columns.
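Here is one way the suggested layout could look, as a sketch rather than a drop-in migration. The assumptions: SQLite via Python's sqlite3 module instead of MySQL (SQLite spells it AUTOINCREMENT where MySQL uses AUTO_INCREMENT), and the names attemptId and idx_attempts_user_datetime are invented for illustration. Only userId comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users (
        userId       INTEGER PRIMARY KEY AUTOINCREMENT,
        fsname       TEXT NOT NULL,
        emailaddress TEXT NOT NULL UNIQUE      -- still unique, no longer the PK
    );
    CREATE TABLE attempts (
        attemptId INTEGER PRIMARY KEY AUTOINCREMENT,   -- hypothetical column
        userId    INTEGER NOT NULL REFERENCES users (userId),
        score     INTEGER NOT NULL,
        datetime  TEXT NOT NULL
    );
    -- Covers both the join column and the datetime comparison.
    CREATE INDEX idx_attempts_user_datetime ON attempts (userId, datetime);
""")

# Made-up sample data; lastrowid returns the generated userId.
ann_id = conn.execute(
    "INSERT INTO users (fsname, emailaddress) VALUES (?, ?)",
    ("Ann", "ann@example.com"),
).lastrowid
bob_id = conn.execute(
    "INSERT INTO users (fsname, emailaddress) VALUES (?, ?)",
    ("Bob", "bob@example.com"),
).lastrowid
conn.executemany(
    "INSERT INTO attempts (userId, score, datetime) VALUES (?, ?, ?)",
    [
        (ann_id, 40, "2015-05-01 10:00:00"),
        (ann_id, 75, "2015-05-03 10:00:00"),
        (bob_id, 30, "2015-05-01 10:00:00"),
        (bob_id, 20, "2015-05-04 10:00:00"),
    ],
)

# The same last-attempt query, now joined on the integer key.
last_attempts = conn.execute("""
    SELECT u.fsname, u.emailaddress, la.score
    FROM users u
    INNER JOIN attempts la ON u.userId = la.userId
    LEFT JOIN attempts mr
        ON la.userId = mr.userId AND la.datetime < mr.datetime
    WHERE mr.datetime IS NULL
    ORDER BY u.userId
""").fetchall()
print(last_attempts)
```

The query keeps its shape; only the join columns change from emailaddress to userId. Whether the composite index is the best choice depends on your data and on the other queries you run, so check the execution plan on your own server.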

I haven't seen this nice of an answer in a long time. Great explanation, very easy to follow. — AdamMc331, May 21 '15 at 18:36
Hello, first of all thanks a lot for your answer... I have been trying to understand it ever since you posted it, but I have one problem understanding it: shouldn't `la.emailaddress` and `mr.emailaddress` be from the same table, in the same order? — Riley Willow, May 21 '15 at 18:57
You can easily see how the query works: add `la.emailaddress, la.datetime, mr.datetime` to the `SELECT` clause, remove the `mr.datetime IS NULL` condition and put instead a condition that limits the number of returned users to only 3-4 and add `ORDER BY u.emailaddress, la.datetime, mr.datetime`. This way you can see how the tables are joined. — axiac, May 21 '15 at 19:18
