In order to get the most recent record of a certain combination of identifiers, I use the following query:
SELECT t1.*
FROM (
SELECT id, b_id, c_id
FROM a
ORDER BY epoch DESC
LIMIT 18446744073709551615
) AS t1
GROUP BY t1.b_id, t1.c_id
If there are multiple records of a combination of b_id
+ c_id
, then it will always select the one with the highest value of epoch
(and as such, the latest in time).
The LIMIT
is added as a workaround to force MariaDB to actually order the results. I successfully use this construction a lot in my application, and so have others.
However, now I came across an exact same query in my application, where I "accidentally" used more columns than strictly necessary in the sub-query:
SELECT t1.*
FROM (
SELECT id, b_id, c_id, and, some, other, columns, ...
FROM a
ORDER BY epoch DESC
LIMIT 18446744073709551615
) AS t1
GROUP BY t1.b_id, t1.c_id
I've tested both queries. And the exact same query, but with as only change those additional columns, makes the result to become incorrect. In fact, the number of columns determines the result. If I have <= 28 columns, the result is okay. If I have 29 columns, then it gives the third-latest record (which is wrong too), and if I have 30-36 columns it always gives the second-latest record (36 is the total number for table a
). In my testing, it didn't seem to matter which particular column was removed or added.
I'm having a hard time finding out why exactly the behavior changes after I add more columns. Also, perhaps by chance, it still gave the correct result yesterday. But today suddenly the result changed, probably after new records (with unrelated identifiers) were added to table a
. I've tried using EXPLAIN
:
# The first query, with columns: id, b_id, c_id
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 280 Using where; Using temporary; Using filesort
2 DERIVED a ALL NULL NULL NULL NULL 280 Using filesort
# The second query, with columns: id, b_id, c_id, and, some, other, columns, ...
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 276 Using where; Using temporary; Using filesort
2 DERIVED a ALL NULL NULL NULL NULL 276 Using filesort
But that doesn't really help me much, other than that I can see that the key_len
is different. The second-latest record that is incorrectly received in the second query is one where id = 276
, the actual latest record that it correctly retrieves using the first query is one where id = 278
. In total there are 307 rows now, and yesterday perhaps just ~300. I'm not sure how to interpret these results to understand what is going wrong. Does anyone know? And if not, what else can I do to find out what is causing these strange results?