5

I have a problem with a relatively simple query and the execution plan Access choose for it.

The query is of this form

SELECT somethings
FROM A INNER JOIN (B INNER JOIN (C INNER JOIN D ON ...) ON ...) ON ...
WHERE A.primaryKey= 1 AND D.d = 2;

C and D have relatively few rows. A and B have a few thousands rows.

The query, which returns 2 rows (not sure if this is pertinent) is really slow. It runs in 17 seconds. If I remove the AND D.d = 2 part of the where clause, the query now returns 4 rows and run instantly.

So my understanding is that the JET engine could run the query without the filter on D.d instantly, then execute the said filter instantly (only 4 rows to filter). Therefor it should not be too much longer to run the query with the D.d = 2 filter.

I tried to create a sub query without the filter and include this in another query that would just filter the result, but it's still slow. My guess is that the JET engine is smart enough to "flatten" the sub-queries so the result is the same.

Since I was unable to make the query run as I wished I used the JETSHOWPLAN thingy so that Access would output it's execution plan. Here is what I found:

For the fast query (the one without D.d = 2) the first step of the query plan is to apply the A.primaryKey = 1 filter on the A table. This result in a data set of 1 row out of more than 30000. Then the joins seems to be executed from A to D using index with a data set that never goes over 4 rows.

The slow query seems to be executed in the revers order. D and C are joined first then D.d = 2 is tested. After that, the joins from C to A are executed. By doing this this way the data that needs to be joined from D to C, from C to B and from B to A is much larger. When all the JOIN are executed and before A.primaryKey=1 is executed the data set will have 120K rows.

Is there a way I could force the right query plan on Access?

I hope I was clear. Let me know if I should post the query plans. I did not because they are quite large.

Thanks in advance,

mp

Mathieu Pagé
  • 10,764
  • 13
  • 48
  • 71
  • 2
    Since you can't provide hints to the query planner, I suspect you're SOL. If this is performance-critical, you could append the fast part of the query into a scratch table and use it for another `D.d = 2` query. I know that sound nasty (it is!), but I don't know what else you could do other than live with the slowness of the single query you have now. – HansUp Jan 27 '12 at 17:43
  • @HansUp: Thanks for your input. I feared I will have to use such an ugly hack, but if I can't find any other solution I will have to use one. My users are waiting for the result of this query a couple times per day and 17 seconds is a long time when all you do is stare at the screen. – Mathieu Pagé Jan 27 '12 at 17:59

2 Answers2

2

Do it in VBA code? The idea would be to take out the part that's slow and execute the fast-returning query, then append the slow part in sql.

db.execute "select * from qryFast inner join d on qryfast.dkey = d.d where d.d = 2

No, VBA code in a module is different from a sub-query. @HansUp has clarified for us that executing the code in one step, as I've shown above, won't improve the performance. You should be able to get the results in memory quickly, if you're familiar with writing code in modules, but then getting the output where you need it to go might slow you down more.


in other words, you should be able to get the results of qryFast into a recordset in memory quickly, and then apply a filter on qryFast.dkey = d, and also get a recordset quickly from 'select * from tableD where d=2' to look up the related info you want from tableD, but getting all that stuff out of memory and to a place where your front-end can access it might take longer than the 17 seconds they're waiting now.


In fact, it might kick it in the pants enough if you change qryFast to include a condition where dkey = 2 (or whatever the pk is on tableD)


another idea: have 3 queries, qryFast, qryD, and qryFastWithD joining the two. I'm just tossing out ideas, here.


or, as you say in your comments, try containing different parts of the query in sub-queries, but I would think the optimizer wouldn't be fooled by such a trick, if moving a piece of it into a sub-query didn't work. By all means, whatever works, take it.

Beth
  • 9,531
  • 1
  • 24
  • 43
  • No, haven't tried, just guessing the optimizer might not extend into VBA. Instead of doing it all in one SQL statement as I've shown, he could try breaking it into multiple steps in VBA. – Beth Jan 27 '12 at 18:12
  • Unless I miss something it's the same thing I already tried. I created a subquery and used this as the source of another query that simply SELECT * FROM qryFast WHERE d = 2. – Mathieu Pagé Jan 27 '12 at 18:13
  • 1
    @Beth: All SQL query, whether they came from a stored query or are generated in VBA are still executed by the JET engine... I think. Right... ? – Mathieu Pagé Jan 27 '12 at 18:15
  • VBA will hand off the SQL statement to the db engine. The engine will do its thing using its query planner. No escape! :-) – HansUp Jan 27 '12 at 18:15
  • Yes, of course the JET engine executes VBA code. If he can do the join in memory instead of a temp table, he can get around the slow part. – Beth Jan 27 '12 at 18:17
  • 1
    Could you restate that last one, Beth? I'm intrigued, but it's just not sinking in. And my head is starting to hurt. – HansUp Jan 27 '12 at 18:20
  • I finally got it to work by mixing things up until the query planner agreed with me. Your 3 queries idea gave me the idea to isolate the "A.primaryKey= 1" in a sub-query to ensure it's executed before A is joined to B. It's something like this : "SELECT ... FROM (SELECT ... FROM A WHERE a.primaryKey=1) AS qryA INNER JOIN B ... WHERE D.d = 2;" If you integrate this in your answer I'll accept it. Otherwise I'll post it as my own answer. – Mathieu Pagé Jan 27 '12 at 19:32
  • @HansUp: Thanks for discussing this with us. – Mathieu Pagé Jan 27 '12 at 19:35
  • @MathieuPagé Well done, Sir. And thanks for asking this question. – HansUp Jan 27 '12 at 19:40
2

I finally got it to work by mixing things up until the query planner agreed with me. I isolated the "A.primaryKey= 1" in a sub-query to ensure it's executed before A is joined to B. It's something like this :

SELECT ... 
FROM (SELECT ... FROM A WHERE a.primaryKey=1) AS qryA
   INNER JOIN B ...
WHERE D.d = 2;
Mathieu Pagé
  • 10,764
  • 13
  • 48
  • 71