0

I have read here that MySQL processes ordering before applying limits. However, I receive different results when applying a LIMIT parameter in conjunction with a JOIN subquery. Here is my query:

SELECT 
    t1.id,
    (t2.counts / c.matches)
FROM
    table_one t1
        JOIN
    table_two t2 ON t1.id = t2.id
        JOIN
    (
        SELECT 
            t1.id, COUNT(DISTINCT t1.id) AS matches
        FROM
            table_one t1
        JOIN table_two t2 ON t1.id = t2.id
        WHERE
            t1.id IN (3390 , 3236, 148, 2811, 829, 137)
                AND t2.value_one <= 30
                AND t2.value_two < 2
        GROUP BY t1.id
        ORDER BY (t2.counts / matches)
        LIMIT 0, 50                        -- PROBLEM IS HERE (I think)
    ) c ON c.id = t1.id
ORDER BY (t2.counts / c.matches), t1.id;

Here is a rough description of what I think is happening:

  1. The sub-query selects a bunch of ids from table_one that meet the criteria
  2. These are ordered by (t2.counts / matches)
  3. The top 50 (in ascending order) are fashioned into a table
  4. This resulting table is then joined on the the id column
  5. Results are returned from the top level JOIN - without a GROUP BY clause this time. table_one is a reference table so this will return many rows with the same ID.

I appreciate that some of these joins don't make a lot of sense but I have stripped down my query for readability - it's normally quite chunky .

The problem is that when, I include the LIMIT parameter I get a different set of results and not just the top 50. What I want to do is get the top results from the subquery and use these to join onto a bunch of other tables based on the reference table.

Here is what I have tried so far:

  • LIMIT on the outer query (this is undesirable as this cuts off important information).
  • Trying different LIMIT tables and values.

Any idea what is going wrong, or what else I could try?

jda5
  • 1,390
  • 5
  • 17
  • ```.. WHERE t1.id IN (3390 , 3236, 148, 2811, 829, 137) AND t1.id NOT IN (43, 523) .. GROUP BY t1.id ..``` - (1) 2nd condition is obviously excess and should be removed (2) The subquery must produce 6 output rows only, so `LIMIT 0, 50` and `ORDER BY` in it makes no sense. – Akina May 18 '22 at 09:32
  • `.. GROUP BY t1.id HAVING COUNT(DISTINCT t1.id) < 7 ..` ?? `COUNT(DISTINCT t1.id)` must be 1 or NULL when `id=NULL`. – Akina May 18 '22 at 09:35
  • `t1.id` does not refer to the ID values of `table_one` - in fact since `table_one` is a reference table, it does not have an ID number. `t1.id` is never `NULL`. You are right though, the `HAVING COUNT(DISTINCT t1.id) < 7` is not necessary – jda5 May 18 '22 at 09:48
  • @Akina I'll edit my question. Thanks for pointing this out. Sadly this doesn't fix the issue I am having with the limits. – jda5 May 18 '22 at 09:49

1 Answers1

0

I have found a solution to my problem. It seems as if my matches column name does can't be used in my ORDER BY clause - which is weird since I don't get an error. Either way, this solves the problem:

SELECT 
    t1.id,
    (t2.counts / c.matches)
FROM
    table_one t1
        JOIN
    table_two t2 ON t1.id = t2.id
        JOIN
    (
        SELECT 
            t1.id, COUNT(DISTINCT t1.id) AS matches
        FROM
            table_one t1
        JOIN table_two t2 ON t1.id = t2.id
        WHERE
            t1.id IN (3390 , 3236, 148, 2811, 829, 137)
                AND t2.value_one <= 30
                AND t2.value_two < 2
        GROUP BY t1.id
        ORDER BY (t2.counts / COUNT(DISTINCT t1.id))    -- This line is changed
        LIMIT 0, 50
    ) c ON c.id = t1.id
ORDER BY (t2.counts / c.matches), t1.id;
jda5
  • 1,390
  • 5
  • 17