Complex MySQL with `group by` and `order by` or `max()`

Question

NOTE: Yes, there are many similar questions to this one. I would not bother posting a new question if any of them helped me understand how to correct my query.

This is a basic MySQL threaded messaging project, with 3 tables: messages, messages_people, and people. Each message can have multiple people attached to it. Each person can have multiple messages.

messages < messages_people > people

I'm looking for a single query that returns a list of THREADS (not all messages, just the last message in each thread a person is a part of), with a bunch of information about the LAST message in each thread. The following schema and data ...

http://sqlfiddle.com/#!9/f724ca/8

... includes a query that selects the correct rows (with only a few fields, for brevity) for a particular person (person_id = 1). Note: Change person_id to see other results.

Prior to MySQL 5.7, the following query would show the most recent message (highest id) in any thread a single person was a part of:

SELECT m.`id`, m.`thread_id`, m.`message`
FROM `messages_people` AS mp
LEFT JOIN `messages` AS m ON (m.`id` = mp.`message_id`)
WHERE mp.`person_id` = 1
GROUP BY m.`thread_id`
ORDER BY mp.`id` DESC;

But MySQL 5.7 changed how GROUP BY works in a way I don't yet understand. None of the online explanations show any before/after examples, which is how I learn best. The following query does work ...

SELECT m.`id`, m.`thread_id`, m.`message`
FROM `messages` AS m
WHERE m.`id` IN (
    SELECT MAX(m.`id`) AS `id`
    FROM `messages` AS m
    WHERE m.`thread_id` IN (
        SELECT DISTINCT(m.`thread_id`)
        FROM `messages_people` AS mp
        LEFT JOIN `messages` AS m ON (m.`id` = mp.`message_id`)
        WHERE mp.`person_id` = 1
    )
    GROUP BY m.`thread_id`
);

... however the idea of so many nested queries makes me vomit in my mouth a little. It's clearly neither correct nor performance-efficient.

I know I can tweak my.cnf to get pre 5.7 GROUP BY functionality by setting sql_mode=only_full_group_by to sql_mode='', but I'd rather learn the correct approach than be forced to write unportable code.

What's the correct way of using GROUP BY with ORDER BY in a single query with MySQL 5.7?

`It's clearly neither correct nor performance-efficient.` you wrote it did work, so how come it is not correct? How do you know it is not performance efficient? Have you tried any of the other techniques listed in the duplicate question's answers? There are 5 different solutions there at least that can be used in v5.7. — Shadow, Jul 27 '20 at 22:18
You're right, that's an assumption based on empirical evidence that nested queries tend to take longer. It's entirely possible that a better-written query would take just as long. — designosis, Jul 27 '20 at 22:30
It's a bit frustrating having my question – one I spent a long time formulating – closed within a minute. Twice. Stackoverflow has changed. I'd suggest there are worse things than having several differently-worded questions with similar answers. — designosis, Jul 27 '20 at 22:37
Differently worded questions does not mean that the questions are different. Since you are asking how to select top record from a group, your question is the same as the other one, just with different table and field names. Ask a question that is different. For example, you could try to implement one of the other solutions suggested in the duplicate question. If you get stuck with that attempt, you can ask for help with that specific problem. Btw, closing duplicate questions has been a feature for quite a while, so cannot really understand your comnent around SO has changed. — Shadow, Jul 27 '20 at 22:42
Fair enough. My point about duplicate questions is that – given how rapidly practices change – old answers (and that one is 10 years old) are 90% obsolete or incorrect, which forces a ton of reading to get anything of use (e.g. there's no mention of MySQL 5.7). It's often the case that newly worded questions and answers are more relevant to my searches, so my sense is that new question/answers help both search engines and end users. I may be wrong. Either way, I appreciate the advice. I'm sure I'll figure something out. — designosis, Jul 27 '20 at 22:58
This partidular question has been kept up to date (has solution with window functions for mysql v8.0). The rest of the answers mostly use basic set algebra, so do not get invalidated over time. If you know there are older questions relating your one, then you need to validate them, you cannot assume (again) that those are automatically outdated. — Shadow, Jul 27 '20 at 23:04
I agree that the business of identifying duplicate questions (while brilliant in its conception) is flawed in its execution. That said, the premise of your question is also flawed. Your query worked, but it was never valid sql; not now, and not 10 years ago. It relied upon an undocumented quirk in MySQL. Anyway, there are dozens of solutions to this problem (including those described in the manual under its chapter dedicated to this very topic) — Strawberry, Jul 28 '20 at 07:27
In the past, on seeing a post asking a question in my area of knowledge, it has given me a sense of warmth and purpose to help with an answer. I think what's really going on is that I'm just now discovering how much I've relied on people on stackoverflow being excited to solve little problems, to share their understanding with others. Reputation points are such a simple and sweet incentive for peer reviewed collaboration. — designosis, Jul 28 '20 at 18:39
I am not a smart person, complex SQL queries fry my circuits. I've spent ~14 hours reading posts, trying variations, and struggling with what should be a simple query. So I imagined that this community ... a place where people love answering simple questions ... would have more to offer than RTFM. I suppose every social network evolves, becoming something else. It's just disheartening. — designosis, Jul 28 '20 at 18:39
This might help : https://stackoverflow.com/a/68894680/13457138 — , Aug 24 '21 at 11:07

Complex MySQL with `group by` and `order by` or `max()`

0 Answers0