0

I have a HUGE query which I need to optimize. Before my coding it was like

SELECT [...] WHERE foo = 'var' [...]

executed 2000 times for 2000 different values of foo. We all know how slow it is. I managed to join all that different queries in

SELECT [...] WHERE foo = 'var' OR foo = 'var2' OR [...]

Of course, there are 2000 chained comparisons. The result is a huge query, executed a few seconds faster than before but not enough. I suppose the StringBuilder I am using takes a while in building the query, so the time earned by saving 1999 queries is wasted in this:

StringBuilder query = new StringBuilder();
foreach (string var in vars)
   query.Append("foo = '").Append(var).Append("' OR ");
query.Remove(query.Length - 4) // for removing the last " OR "

So I would like to know if I could use some workaround for optimize the building of that string, maybe joining different values in the comparison with some SQL trick like

SELECT [...] WHERE foo = ('var' OR 'var2' OR [...])

so I can save some Append operations. Of course, any different idea trying to avoid that huge query at all will be more than welcome.

Armaggedon
  • 399
  • 4
  • 14
  • 1
    Check the IN operator: `SELECT [...] WHERE foo IN ('var', 'var2', ...)` – Mikhail Sep 30 '13 at 13:13
  • 3
    from where does the list of var's come? – Dan Bracuk Sep 30 '13 at 13:13
  • 1
    @Mikhail, oracle has a 1000 list item limit. – Dan Bracuk Sep 30 '13 at 13:13
  • @DanBracuk Well, there is a workaround for it, but, i guess, it will neglect any performance improve. Armaggedon, have you tried parametrizing this query, not hardcoding the values into string? Considering, it runs 2000 times, caching it will improve performance greatly. – Mikhail Sep 30 '13 at 13:25
  • @Mikhail I've tried the `IN` operator with sets of less than 1000 values and there is no improvement compared with the `=` operator. Also, the query is already parameterized, all except this. I will try it, but I'm not sure if it's going to improve anything. – Armaggedon Sep 30 '13 at 13:46
  • @DanBracuk `vars` comes from an Active Directory database, nothing to do with this, so I cannot join the queries. It's not under my control, so I can only query it. – Armaggedon Sep 30 '13 at 13:49
  • @Armaggedon, did you check the execusion plan ? are there any indexes on FOO column ? are they used ? – A.B.Cade Sep 30 '13 at 14:26
  • @A.B.Cade No indexes and no possibility of create them :( – Armaggedon Sep 30 '13 at 14:30
  • If it were me, I'd be more worried about limits on list item limits and sql command length limits than performance. No matter what you do, it's going to be slow. – Dan Bracuk Sep 30 '13 at 14:58
  • SQL Injection vuln on your string builder approach... – Martin Milan Sep 30 '13 at 15:37
  • @MartinMilan No chance of SQL Injection here because of the nature of the application. But thanks for the concern anyway :) – Armaggedon Oct 01 '13 at 07:25

3 Answers3

1

@Armaggedon,

For any decent DBMS, the IN () operator should correspond to a number of x OR y corresponding comparisons. About your concern about StringBuild.Append, its implementation is very efficient and you shouldn't notice any delay regarding this amount of data, if you have a few MB to spare for its temporary internal buffer. That said, I don't think your performance problem is related to these issues.

For database tuning it's always a far shot to propose solutions without the "full picture", but I think your problem might be related to compiling such a huge dynamic SQL statement. -- parsing and optimizing SQL statements can consume lots of processor time and it should be avoided.

Maybe you could improve the response time by moving your domain into an auxiliary indexed table. Or by moving the various checks over the same char column to a text search using INSTR functions:

-- 1. using domain table
SELECT myColumn FROM myTable WHERE foo IN (SELECT myValue FROM myDomain);

-- 2. using INSTR function
SELECT myColumn FROM myTable WHERE INSTR('allValues', foo, 1, 1) > 0;
Gerardo Lima
  • 6,467
  • 3
  • 31
  • 47
0

Why not use the IN-operator as of IN-operator on W3school? It lets you combine your values in a much shorter way. You can also store the values in a temporary table as mentioned in this post to bypass the limit of 1000 rows on Oracle

Community
  • 1
  • 1
MakePeaceGreatAgain
  • 35,491
  • 6
  • 60
  • 111
  • Good advice, but I'm afraid the query building is so complex that I cannot change it enough to make it work. It's not my code and I'm having a lot of trouble with it. – Armaggedon Sep 30 '13 at 13:52
  • Please beware of [www.w3schools.com]. Its name is deceiving, in that it suggests this organization is related to W3C and *it is not*. moreover, it contains many plain *wrong information*. For information about these problems, please refer to [www.w3fools.com]. – Gerardo Lima Sep 30 '13 at 16:01
  • 1
    But there´s no reason do down-vote this answer as its still a right solution for the problem, isn´t it? – MakePeaceGreatAgain Oct 01 '13 at 05:53
0

It's been a while since I danced the Oracle dance, but I seem to remember a concept of "Bind Variables" - typically used for bulk insertions... I'm wondering if you could express the list of values as an array, and use that with IN...

Have to say - this is just an idea - I don't have time to research it further for you...

Martin Milan
  • 6,346
  • 2
  • 32
  • 44
  • I think you are refering to parameterized queries (instead of hardcoding the values) as @Mikhail suggested. I'm going to try that. – Armaggedon Oct 01 '13 at 07:29
  • It is a form of parameterised query - the point being that the parameter is actually an array used to represent the list... – Martin Milan Oct 01 '13 at 08:08
  • I finally tried this (using [this](http://www.dforge.net/2012/11/05/oracle-select-where-col-in-bind_variable_array/) as example) with no results: the query keeps running forever until the browser gives up and I have to reboot the Application Pool. Good idea, though. But the `IN` clause is smarter than us :P – Armaggedon Oct 01 '13 at 13:25
  • OK... Temporary table approach then - create a temporary table with the values, and then use the Exists function... – Martin Milan Oct 01 '13 at 14:25