0

The following query performs very poorly, due to the "order by". My goal is to get only a small subset of the resultset (using ROWNUM, for example). However, when I add "order by" it goes through the entire resultset performing an index lookup for each record, which makes it extremely slow. Without sorting the query is about 100 times faster when I limit the resultset to, for example, 1000 records.

QUERY:

SELECT text_field 
 from mytable where 
 contains(text_field,'ABC', 1)>0 
 order by another_field;

THIS IS HOW I CREATED THE INDEX:

CREATE INDEX myindex ON mytable (text_field) INDEXTYPE IS ctxsys.context FILTER BY another_field

EXECUTION PLAN:

---------------------------------------------------------------
| Id  | Operation                    | Name                   |
---------------------------------------------------------------
|   0 | SELECT STATEMENT             |                        |
|   1 |  SORT ORDER BY               |                        |
|   2 |   TABLE ACCESS BY INDEX ROWID| MYTABLE                |
|*  3 |    DOMAIN INDEX              | MYINDEX                |
---------------------------------------------------------------

I also used CTXCAT instead of CONTEXT, and no improvement. I think the problem is, when I want the results sorted (only top 1000), it performs an index lookup for each record in the "entire" resultset. Is there a way to avoid that?

Thank you.

nec
  • 3
  • 1
  • Sorry, looks like there was a problem with formatting the execution plan portion in the text above. – nec Jan 25 '15 at 21:29
  • 1
    Is it 100 times faster to retrieve all the data, or just the first batch/page of results? Doesn't help you tune it with the ordering of course, but might help understand the difference. – Alex Poole Jan 25 '15 at 22:50
  • Just the first page of results. But, when I use ordering it goes through all results, even if I request just the first page, and that what makes it slow. So, I was trying to find a way to create the index, so that it does not have to go through the entire resultset to return the first page, when asking for sorted. – nec Jan 26 '15 at 14:49
  • Where is your query that restricts to 1000 rows? Are you applying the rownum check to the query you showed as a subquery? Please add that code and its explain plan to the question. It ought to get a 'count stopkey' step in the plan. – Alex Poole Jan 26 '15 at 15:21
  • I actually execute a "sql pass-through" from within SAS. I use "outobs" parameter. But, I think this works similar to ROWNUM. – nec Jan 26 '15 at 15:33
  • Sorry, don't know anything about that or how it works, but is SAS is doing the limiting after it's executed the query? (I think [this](http://stackoverflow.com/a/11747832/266304) suggests so?). Oracle can't optimise it if it doesn't know you're limiting it; unless you pass the rownum limit through explicitly I don't think you can do much. – Alex Poole Jan 26 '15 at 15:54
  • if I use ROWNUM in the where clause the ordering occurs "within" the resultset, i.e. first page. However, I want to ordering to take place considering all records (all pages), and just return the first page. – nec Jan 26 '15 at 16:46

1 Answers1

0

To have the ordering applied before the rownum filter, you need to use an in-line view:

SELECT text_file
from (
 SELECT text_field 
  from mytable where 
  contains(text_field,'ABC', 1)>0 
  order by another_field
)
where rownum <= 1000;

With your index in place Oracle should optimise this to do as little work as possible. You should see 'sort order by stopkey' and 'count stopkey' steps in the plan, which is Oracle being clever and knowing it only needs to get 1000 values from the index.

If you don't use the in-line view but just add the rownum to your original query it will still optimise it but as you state it will order the first 1000 random (or indeterminate, anyway) rows it finds, because of the sequence of operations it performs.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • Thanks Alex. Yes, I noticed "COUNT STOPKEY" in the plan, and I see some improvement (~20%), but this is still lot slower compared to the performance without the sort. I'm not sure how much Oracle was able to optimize the search, because it seems that it goes through the entire result set again. – nec Jan 26 '15 at 17:26
  • @nec - still a lot slower than just getting the first page of results, or the first 1000 results? Just to check you're comparing similar things. I haven't done much with Oracle text though so I'm not sure if the indexes work quite the same. You could also check your statistics are up to date I suppose, if this is a new index. – Alex Poole Jan 26 '15 at 17:34
  • Alex, actually after a few more runs it seems that the performance is "lot better", i.e. order of magnitude... I thought that was perhaps because of cache, and tried searching using different text, and still looks impressive. I will do some more testing, and post an update. Thanks !!! – nec Jan 26 '15 at 17:37
  • @AlexPoole You can undelete your answer from [here](http://stackoverflow.com/questions/28158144/using-a-case-statement-in-a-check-constraint/28158560?noredirect=1#comment44689281_28158560). Now OP's want to secure data and your solution is the best for it :). You will get my voteup . – Robert Jan 26 '15 at 21:33
  • @AlexPoole True but his answer is quite different :) – Robert Jan 26 '15 at 21:43