Optimizing Query. Want to pick the last record without using max in sub-query

Question

This is my query:

SELECT B.RECORDID, A.ITEMCODE, A.ITEMNAME, A.STOCKINHAND, B.SALEPRICE 
FROM ITEMMASTER A, STOCKENTRY B 
WHERE A.ITEMID = B.ITEMID 
  AND RECORDID = (SELECT MAX(RECORDID) FROM STOCKENTRY 
                  WHERE ITEMID = A.ITEMID) 
  AND A.STOCKINHAND > 0 
  AND B.SALEPRICE > 0 
  AND B.INVOICEDATE IS NOT NULL 
ORDER BY A.ITEMNAME, B.INVOICEDATE;

Table B (StockEntry) may contain one or more records whereas Table A (ItemMaster) will definitely have only a single row for that ItemID.

If I remove the sub-query in the WHERE clause, it displays one or more rows. I feel that picking max(RecordID) through sub-query in the WHERE clause is slowing query. I do have indexes on RecordID, InvoiceDate, ItemID but still MySQL logs show that this query is not performing well. I can't change the column sequence for some reason.

Is there any better way to optimize this query?

Not sure if the indexes in `MySQL` work similar to `SQL Server`, but for `SQL Server` I would add a covering index on Stockentry for `ItemID, RecordID` which should speed up the subselect substantially. — JNK, Aug 05 '11 at 15:15
Do you have an average time of how long your current query takes? — ace, Aug 05 '11 at 15:24
@ace: Here is a link to my question at dba.stackexchange.com where I have posted query log: http://dba.stackexchange.com/questions/4358/improving-slow-queries-slow-query-log-attached — RKh, Aug 05 '11 at 15:30

score 6 · Accepted Answer · edited May 23 '17 at 12:03

6

It's likely slow because it's running a correlated subquery for every row of the outer query. There are two solutions that tend to run more efficiently.

One is to use a derived table, which uses a subquery, but it only executes the subquery once to prepare the derived table.

SELECT B.RECORDID, A.ITEMCODE, A.ITEMNAME, A.STOCKINHAND, B.SALEPRICE 
FROM ITEMMASTER A
JOIN STOCKENTRY B ON A.ITEMID = B.ITEMID
JOIN (SELECT ITEMID, MAX(RECORDID) AS MAXRECORDID 
      FROM STOCKENTRY GROUP BY ITEMID) M
  ON (M.ITEMID, M.MAXRECORDID) = (B.ITEMID, B.RECORDID)
WHERE A.STOCKINHAND > 0 
  AND B.SALEPRICE > 0 
  AND B.INVOICEDATE IS NOT NULL 
ORDER BY A.ITEMNAME, B.INVOICEDATE;

The other solution is to use an exclusion join to find the row in B such that no other row exists with the same itemid and a greater recordid. With correct indexes (e.g. a compound index on (ITEMID, RECORDID), this should perform very well.

SELECT B.RECORDID, A.ITEMCODE, A.ITEMNAME, A.STOCKINHAND, B.SALEPRICE 
FROM ITEMMASTER A
JOIN STOCKENTRY B ON A.ITEMID = B.ITEMID 
LEFT OUTER JOIN STOCKENTRY B2
  ON B.ITEMID = B2.ITEMID AND B.RECORDID < B2.RECORDID
WHERE B2.ITEMID IS NULL 
  AND A.STOCKINHAND > 0 
  AND B.SALEPRICE > 0 
  AND B.INVOICEDATE IS NOT NULL 
ORDER BY A.ITEMNAME, B.INVOICEDATE;

This type of problem comes up frequently on Stack Overflow. I've added the greatest-n-per-group tag to the question so you can see other cases.

Re @RPK's comment:

I don't use MySQL QB myself, and that app has changed so many times I can't advise on how to use it. But in the mysql monitor (command-line), I use a combination of EXPLAIN and PROFILING to give me stats.

However, you made a comment about not being to modify (or create?) indexes. That's going to hamstring your attempts to optimize.

edited May 23 '17 at 12:03

Community

1
1

answered Aug 05 '11 at 15:22

Bill Karwin

538,548
86
673
828

+1, using new join syntax and removing the correlated subquery for every row. – KM. Aug 05 '11 at 15:24
@Bill: There is a correction in your first query. After WHERE you have left AND. – RKh Aug 05 '11 at 15:38
@Bill: Among the two queries which one you recommend? Is there any Query Analyzer tool with which I get actual stats? I am using MySQL Query Browser but it caches the results so I am not getting actual time taken. – RKh Aug 05 '11 at 15:47
See above amendment to my answer. – Bill Karwin Aug 05 '11 at 16:10
@Bill: I have some confusion on using TOP or LIMIT 1 over existing sub-query in my original post. Even if I use TOP or LIMIT, the query has to retrieve the entire set of records. How it is fast when compared to max(RecordID)? – RKh Aug 06 '11 at 16:51
1

@RPK: Depends on the internal implementation of the RDBMS. When you compare TOP to LIMIT, you're talking about Microsoft SQL Server vs. MySQL, and these are two completely different technologies. They each might optimize in different ways, so they might be slower or faster than using MAX(RecordID). Also presence of indexes, and choice of storage engine make optimization different. So there's no simple answer. – Bill Karwin Aug 06 '11 at 22:10
@Bill: Leave MySQL and consider SQL Server only. Is something like Top 1 faster than max(RecordID)? In both case, you need to fetch the entire set of records. – RKh Aug 08 '11 at 06:31
@RPK: Not necessarily. If RecordID is indexed, it should be a quick operation to get the first (MIN) or last (MAX) value in the index B-tree. I don't know specifically if SQL Server can optimize this. – Bill Karwin Aug 08 '11 at 06:51

score 1 · Answer 2 · answered Aug 05 '11 at 15:14

1

try using TOP 1 ... ORDER BY .. DESC, like this:

SELECT B.RECORDID, A.ITEMCODE, A.ITEMNAME, A.STOCKINHAND, B.SALEPRICE 
FROM ITEMMASTER A, STOCKENTRY B 
WHERE A.ITEMID = B.ITEMID 
  AND RECORDID = (SELECT top 1 RECORDID FROM STOCKENTRY 
                  WHERE ITEMID = A.ITEMID
                  order by RECORDID desc) 
  AND A.STOCKINHAND > 0 
  AND B.SALEPRICE > 0 
  AND B.INVOICEDATE IS NOT NULL 
ORDER BY A.ITEMNAME, B.INVOICEDATE;

answered Aug 05 '11 at 15:14

KM.

101,727
34
178
212

I answered before MySql tag was added to the question, `TOP` is not MySql syntax – KM. Aug 05 '11 at 15:17
@KM: MySQL is not recognizing "Top 1". – RKh Aug 05 '11 at 15:24
@RPK, see my previous comment, try something like: `(select RECORDID FROM STOCKENTRY limit 0,1 WHERE ITEMID = A.ITEMID order by RECORDID desc)` but [Bill Karwin's answer](http://stackoverflow.com/questions/6958555/optimizing-query-want-to-pick-the-last-record-without-using-max-in-sub-query/6958686#6958686) will preform **much** better. It is better to final all the "max" values in one query and join to them than to find the max one at a time for each row. – KM. Aug 05 '11 at 15:28

score 1 · Answer 3 · edited May 23 '17 at 12:26

1

TOP is database-specific; you may want to use the MySQL alternative ORDER BY ... DESC LIMIT 1.

This SO post has a good overview of the different ways to implement LIMIT concept across databases.

edited May 23 '17 at 12:26

Community

1
1

answered Aug 05 '11 at 15:22

eykanal

26,437
19
82
113

true. limit 1 works in MySQL. But I am not able to determine exact performance of queries as query results probably got cached. – RKh Aug 05 '11 at 15:28
LIMIT is also database specific. – Brian Hoover Aug 05 '11 at 15:28
FWIW, `TOP` is supported by Sybase and Microsoft SQL Server. `LIMIT` is supported by MySQL, PostgreSQL, and SQLite. – Bill Karwin Aug 05 '11 at 15:44

score 1 · Answer 4 · answered Aug 05 '11 at 15:37

My suggestion is create a view

CREATE VIEW `STOCKENTRY_V` AS 
SELECT ITEMID,MAX(RECORDID) AS RECORDID
FROM STOCKENTRY
GROUP BY ITEMID;

And you can do a simple join on 2 tables + the view. I'm interested how fast it will perform.

SELECT B.RECORDID, A.ITEMCODE, A.ITEMNAME, A.STOCKINHAND, B.SALEPRICE 
FROM ITEMMASTER A, STOCKENTRY B, STOCKENTRY_V C
WHERE A.ITEMID = B.ITEMID AND AND B.ITEMID = C.ITEMID
  AND B.RECORDID = C.RECORDID
  AND A.STOCKINHAND > 0 
  AND B.SALEPRICE > 0 
  AND B.INVOICEDATE IS NOT NULL 
ORDER BY A.ITEMNAME, B.INVOICEDATE;

score 1 · Answer 5 · answered Aug 05 '11 at 15:56

1

If the query is frequently used and performance is still an issue, you could create a table of the last record id for an item and keep it up to date using a trigger on the ITEMMASTER table.

answered Aug 05 '11 at 15:56

Kevin Burton

11,676
2
24
37

Optimizing Query. Want to pick the last record without using max in sub-query

5 Answers5