126

Is there any way to get around the Oracle 10g limitation of 1000 items in a static IN clause? I have a comma-delimited list of many IDs that I want to use in an IN clause. Sometimes this list can exceed 1000 items, at which point Oracle throws an error. The query is similar to this...

select * from table1 where ID in (1,2,3,4,...,1001,1002,...)
Aaron Palmer
  • What kind of client do you have? .Net, Java ...? – tuinstoel Dec 30 '08 at 19:27
  • 2
    Have you tried the expansion? ie WHERE (ID=1 OR ID=2 ....) – TJR Sep 25 '12 at 01:48
  • I once suggested that idea to a colleague. What a bad one: nightmare queries regularly end up running on the database (more than 1000 OR operators generated by an application). With such a solution, I would suggest at least limiting the list length... – bdulac Jun 30 '15 at 08:21
  • 20
    @Ben - curious why you marked my question as the duplicate rather than the question that was asked more than 2 years after mine. – Aaron Palmer Jun 30 '15 at 12:08
  • 2
    Goes to show how much of a wasteland this has become. – Otávio Décio Mar 10 '16 at 18:05

11 Answers

110

Put the values in a temporary table and then do a select where id in (select id from temptable)
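To make the idea concrete, here is a minimal Python sketch using the standard-library sqlite3 module as a stand-in for Oracle (in Oracle you would typically use a global temporary table; the table and column names here are made up for illustration):

```python
import sqlite3

# sqlite3 stands in for Oracle here; table1/temptable are illustrative names
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 3001)])

# put the IDs into a temporary table...
conn.execute("CREATE TEMP TABLE temptable (id INTEGER)")
ids = range(1, 1501)  # more values than Oracle's 1000-item IN-list limit
conn.executemany("INSERT INTO temptable VALUES (?)", [(i,) for i in ids])

# ...then reference it with a subquery instead of a literal IN list
rows = conn.execute(
    "SELECT * FROM table1 WHERE id IN (SELECT id FROM temptable)"
).fetchall()
```

Because the IDs live in a table rather than in the statement text, the statement stays the same no matter how many IDs there are, which also avoids re-parsing a different query each time.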

Otávio Décio
  • 7
    Personally I'd put the values into the temp table and use a JOIN to query the values. I don't know whether that's actually better performance or not, though. – Neil Barnwell Dec 30 '08 at 14:17
  • @Neil Barnwell - I think any decent SQL engine would optimize so that the IN and a JOIN would have pretty much the same performance. Using IN at least for me is clearer on its intent. – Otávio Décio Dec 30 '08 at 14:37
  • 23
    @ocdecio - my tests with Oracle 10g show different (and clearly worse) explain plans for the IN, compared to the JOIN. Personally I'd use the JOIN, and would recommend others to *test* different approaches to see differences in performance, rather than guess. – jimmyorr Aug 03 '09 at 19:59
  • @jimmyorr - thank you for taking the time to check the performance, although the OP didn't seem too preoccupied with that. – Otávio Décio Aug 05 '09 at 17:54
  • The IN vs JOIN thing is generally due to the possibility of NULLS in the IN list. – WW. Oct 18 '10 at 12:09
  • 3
    But if he has 2000 values, how will he insert into temp table in a single DB hit? This is better to write some logic and split into 1000, 1000 records and create dynamic query as peter severin said select * from table1 where ID in (1,2,3,4,...,1000) or ID in (1001,1002,...,2000).. – Samurai Mar 05 '13 at 14:52
  • 3
    Use this technique to get good performance on bulk inserts to the temporary table: http://stackoverflow.com/questions/7195665/java-jdbc-multiple-prepared-statement-bulk-insert. This technique brought a 30 second long query to 1 second in my testing. – jmh May 22 '14 at 17:17
  • 11
    I can't create temp tables on production. Thumbs down – GabrielBB Oct 18 '16 at 12:48
  • 2
    In general, when you see a big in-list like this, its an indicator of a design flaw, like a missing entity. – BobC Jan 06 '17 at 05:56
  • Put the values in a temporary table: Can you give me an example for this? – Sara Mar 17 '21 at 23:53
  • You don't need to make a temporary table. Just split the list into several IN clauses (so each IN has fewer than 1000 values) with OR between them. – Alex Apr 09 '21 at 14:02
  • This will give you the temp table to use in your query: WITH IdsStrg AS ( SELECT @yourLargeINEpxressionContents str FROM dual ) , IdsList AS( SELECT trim(COLUMN_VALUE) strRow FROM IdsStrg, xmltable(('"' || REPLACE(str, ',', '","') || '"')) ) – Tiago Martins Sep 10 '21 at 15:19
  • @OtávioDécio Is performance better than having multiple IN clauses? – TriCore Nov 05 '21 at 21:50
  • @TriCore performance is king. At least in my world. – Otávio Décio Nov 12 '21 at 15:19
  • Can we use CTEs? And how do we convert a list of values into a temporary table? – FinThusiast Jan 31 '22 at 10:04
100
select column_X, ... from my_table
where ('magic', column_X ) in (
        ('magic', 1),
        ('magic', 2),
        ('magic', 3),
        ('magic', 4),
             ...
        ('magic', 99999)
    ) ...
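Since the tuple list is usually machine-generated, here is a hedged sketch of how a client might build such a statement (`tuple_in_query` and the `'magic'` tag are illustrative names, and numeric IDs are assumed):

```python
def tuple_in_query(table, column, ids, tag="magic"):
    """Build the multi-value IN workaround: ('magic', col) IN (('magic', v), ...).
    Oracle caps multi-value IN lists at 100,000 entries rather than 1,000."""
    # int(i) both validates and safely formats the numeric IDs
    pairs = ",\n        ".join(f"('{tag}', {int(i)})" for i in ids)
    return (f"select * from {table}\n"
            f"where ('{tag}', {column}) in (\n        {pairs}\n    )")

sql = tuple_in_query("table1", "ID", range(1, 4))
```

Note this still bakes the values into the statement text, so each distinct ID set is a hard parse, unlike a bind-friendly temp-table or collection approach.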
Sergey11g
  • 4
    How come this works? – Matthew Moisen Sep 14 '16 at 18:44
  • 3
    Because Oracle is just so _ (shocking?) you won't believe it till you try it and see that it works! This beats creating a temporary table especially if you are good with vi/vim/subl. – LAK May 05 '17 at 03:56
  • 2
    I'm speechless. No error on the query itself, but I did get ORA-12609 after a self-imposed 5 minute recv_timeout limit. Is there a word for being shocked but not at all surprised at the same time? – JP Duffy Jul 19 '17 at 17:27
  • 2
    Wow! couldn't believe this would actually work but it does. – Anurag Nov 13 '17 at 07:31
  • 22
    This works because this is a multi-value comparison IN list. Oracle implemented this multi-value comparison IN list with a limit of < 100,000 rather than the 1,000 in the normal IN list. So the first value val1 is 'magic', this could have been a column as well. The second value val2 is a column. So the values in the IN list are the values in which val1 and val2 have to match. So val1 must equal input1, and val2 must equal input 2. Since the val1 and input1 are hardcoded to 'magic', then we can just treat this like a normal IN list, but with a limit of 100,000 rather than 1,000. – cody.tv.weber Mar 19 '18 at 19:44
  • This is good for a Production environment one-off for Production issues. – cody.tv.weber Mar 19 '18 at 19:58
  • A bit late, but when doing so I noticed that the explain plan COST increases linearly with the number of ('magic',:para1) entries. Does it degrade performance? – Tim Mar 22 '19 at 22:13
  • 1
    Perhaps replace the `magic` string with a magic number to make the comparison easier? – AlikElzin-kilaka Apr 08 '19 at 15:58
  • Wonderful. In our production environment we only have read-only access to the DB to troubleshoot problems and are not allowed to create any temporary table. This SQL helps a lot to make life easier. Thanks. – Raymond Chiu Sep 27 '19 at 03:53
  • works seamlessly... thanks – Kannan Kandasamy Nov 17 '20 at 22:25
  • Note that I have tried over 100,000 values with Oracle Database 21c (21.3.0.0.0). and it worked. See https://satob.hatenablog.com/entry/2021/10/30/011205 for detail and you can try with https://gist.github.com/satob/ed801a03db2c35d5db362b9510de7be1 by yourself. – SATO Yusuke Oct 31 '21 at 08:53
  • Hi, I tried this method but I got the error : "Invalid number" ORA-01722 – Benoît Jul 20 '22 at 15:34
76

I am almost sure you can split values across multiple INs using OR:

select * from table1 where ID in (1,2,3,4,...,1000) or 
ID in (1001,1002,...,2000)
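A hedged sketch of generating that predicate from a client, assuming numeric IDs (`split_in_clause` is an illustrative name, and Oracle's 1000-item limit is the default chunk size):

```python
def split_in_clause(column, ids, chunk=1000):
    """Build 'col in (...) or col in (...)' with at most `chunk` values per list."""
    groups = [ids[i:i + chunk] for i in range(0, len(ids), chunk)]
    return " or ".join(
        f"{column} in ({','.join(str(int(v)) for v in g)})" for g in groups
    )

# e.g. 'ID in (1,...,1000) or ID in (1001,...,2000) or ID in (2001,...,2500)'
where = split_in_clause("ID", list(range(1, 2501)))
```

If the query has other conditions, wrap the whole generated expression in parentheses so the ORs don't change the meaning of the WHERE clause.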
Peter Severin
  • 12
    The max number of values in an IN clause is yet one of those limitations that you are never supposed to be limited by. – erikkallen Dec 31 '08 at 11:23
  • 3
    One can do that but it means Oracle sees a different query every time and that means a lot of hard parsing and that will slow things down. – tuinstoel Dec 31 '08 at 11:30
  • 1
    I like this solution, some of the other ones posted here are far too much effort! – TrojanName Nov 01 '12 at 09:51
  • But if he has 2000 values, how will he insert them into a temp table in a single DB hit? It's better to write some logic to split them into batches of 1000 records and create a dynamic query. – Samurai Mar 05 '13 at 14:50
  • There is a concept called a temp table. We have the same kind of requirement: we are a middleware holding certain data, and a customer queries with, say, 1000+ IDs at once. So for every request we create a temp table, batch-insert the list of IDs (we do it using JPA, and not much code is needed), then join those temp-table IDs with the actual table's data. – napster Mar 10 '17 at 11:29
  • This is the correct answer. Just split it into multiple INs and it works without any temporary table. – Alex Apr 09 '21 at 14:01
  • Is performance better than creating temp table? – TriCore Nov 05 '21 at 21:49
  • This is the best answer. If you have other conditions, make sure you wrap all your clauses in parentheses. – dataman Mar 24 '23 at 20:44
53

You may try to use the following form:

select * from table1 where ID in (1,2,3,4,...,1000)
union all
select * from table1 where ID in (1001,1002,...)
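A hedged sketch of generating the unioned statement (`union_in_queries` is an illustrative name; UNION ALL is safe here because each ID lands in exactly one slice, so no deduplication between branches is needed):

```python
def union_in_queries(base_select, column, ids, chunk=1000):
    """Emit one SELECT per <=1000-value slice, glued together with UNION ALL."""
    parts = [
        f"{base_select} where {column} in "
        f"({','.join(str(int(v)) for v in ids[i:i + chunk])})"
        for i in range(0, len(ids), chunk)
    ]
    return "\nunion all\n".join(parts)

# 3005 IDs -> four SELECTs joined by three UNION ALLs
sql = union_in_queries("select * from table1", "ID", list(range(1, 3006)))
```

If you need grouping or ordering on top, wrap the whole union in a subquery first so the outer GROUP BY / ORDER BY applies to the combined result.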
Andreas Petersson
rics
  • 4
    When there is no privilege to create temporary table, this makes a super workaround.. – rofans91 Aug 12 '13 at 02:58
  • Not suitable when you don't know how many thousands of values you have ahead of time. E.g. if there are 3005 values, there will be 4 select clauses unioned together. – emily Jun 14 '18 at 17:04
  • I am working on a C# console app; this way is pretty good actually. I just need to count the number of values, take the thousands digit minus 1, and add in the lines dynamically ahead of time. – Rozen Jan 10 '19 at 22:07
  • 3
    This is absolutely dumb, if you have some kind of grouping operators it totally changes the meaning of the query. – Teejay Mar 09 '20 at 09:31
  • @Teejay, you can still use this method with grouping, just put everything into parentheses and select and group from that result set. Example: select a, count(*) from ( select * from... union all select * from... union all select * from... ) group by a – Kevin McCabe Feb 18 '22 at 20:22
8

Where do you get the list of ids from in the first place? Since they are IDs in your database, did they come from some previous query?

When I have seen this in the past it has been because:

  1. a reference table is missing and the correct way would be to add the new table, put an attribute on that table and join to it
  2. a list of ids is extracted from the database, and then used in a subsequent SQL statement (perhaps later or on another server or whatever). In this case, the answer is to never extract it from the database. Either store it in a temporary table or just write one query.

I think there may be better ways to rework this code than just getting this SQL statement to work. If you provide more details you might get some ideas.

WW.
  • 1
    Excellent questions! I often use the array-technique I already posted but I use it when the user has hand picked multiple rows in a user interface data grid. However it is unlikely that a user picks >1000 rows by hand. – tuinstoel Dec 31 '08 at 10:52
  • 1
    In my case the IDs come from a separate database. I get 10k IDs from DB-A and need to get data from DB-B for each one. Currently doing a massive IN (1,2,..,1000) statement. But having erratic performance. I'm here looking for more performant alternatives :) – Basil May 14 '20 at 16:39
  • @Basil try inserting the ids into a temporary table, then join to get the result you need. – WW. May 14 '20 at 21:29
5

Use ... from table(...):

create or replace type numbertype
as object
(nr number(20,10) )
/ 

create or replace type number_table
as table of numbertype
/ 

create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
  open p_ref_result for
    select *
    from employees , (select /*+ cardinality(tab 10) */ tab.nr from table(p_numbers) tab) tbnrs 
    where id = tbnrs.nr; 
end; 
/ 

This is one of the rare cases where you need a hint, else Oracle will not use the index on column id. One of the advantages of this approach is that Oracle doesn't need to hard parse the query again and again. Using a temporary table is usually slower.

edit 1 simplified the procedure (thanks to jimmyorr) + example

create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
  open p_ref_result for
    select /*+ cardinality(tab 10) */ emp.*
    from  employees emp
    ,     table(p_numbers) tab
    where tab.nr = id;
end;
/

Example:

set serveroutput on 

create table employees ( id number(10),name varchar2(100));
insert into employees values (3,'Raymond');
insert into employees values (4,'Hans');
commit;

declare
  l_number number_table := number_table();
  l_sys_refcursor sys_refcursor;
  l_employee employees%rowtype;
begin
  l_number.extend;
  l_number(1) := numbertype(3);
  l_number.extend;
  l_number(2) := numbertype(4);
  tableselect(l_number, l_sys_refcursor);
  loop
    fetch l_sys_refcursor into l_employee;
    exit when l_sys_refcursor%notfound;
    dbms_output.put_line(l_employee.name);
  end loop;
  close l_sys_refcursor;
end;
/

This will output:

Raymond
Hans
tuinstoel
4

I wound up here looking for a solution as well.

Depending on the high-end number of items you need to query against, and assuming your items are unique, you could split your query into batch queries of 1000 items each, and combine the results on your end instead (pseudocode here):

//remove dupes
items = items.RemoveDuplicates();

//how to break the items into 1000 item batches        
batches = new batch list;
batch = new batch;
for (int i = 0; i < items.Count; i++)
{
    if (batch.Count == 1000)
    {
        batches.Add(batch);
        batch = new batch; //start a fresh batch; clearing would also empty the one just added
    }
    batch.Add(items[i]);
    if (i == items.Count - 1)
    {
        //add the final batch (it has < 1000 items).
        batches.Add(batch); 
    }
}

// now go query the db for each batch
results = new results;
foreach(batch in batches)
{
    results.Add(query(batch));
}

This may be a good trade-off in the scenario where you don't typically have over 1000 items - as having over 1000 items would be your "high end" edge-case scenario. For example, in the event that you have 1500 items, two queries of (1000, 500) wouldn't be so bad. This also assumes that each query isn't particularly expensive in its own right.

This wouldn't be appropriate if your typical number of expected items got to be much larger - say, in the 100000 range - requiring 100 queries. If so, then you should probably look more seriously into using the global temporary tables solution provided above as the most "correct" solution. Furthermore, if your items are not unique, you would need to resolve duplicate results in your batches as well.
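The pseudocode above maps to a short helper. Here is a minimal Python sketch, where `run_query` stands in for whatever function issues the real IN-list query against the database:

```python
def query_in_batches(run_query, items, batch_size=1000):
    """De-duplicate, slice into <=batch_size batches, run each, merge results."""
    unique = list(dict.fromkeys(items))  # removes duplicates, preserves order
    results = []
    for i in range(0, len(unique), batch_size):
        results.extend(run_query(unique[i:i + batch_size]))
    return results

# toy stand-in for the real database call: "match" even IDs only
hits = query_in_batches(lambda batch: [x for x in batch if x % 2 == 0],
                        list(range(1, 1501)) + [10, 20])
```

De-duplicating up front keeps a row from being returned twice when the same ID would otherwise land in two different batches.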

Marlon
Mike Atlas
  • 1
    He, my solution is more correct :) There is no need for a temporary table. – tuinstoel Aug 03 '09 at 19:49
  • Correct thinking; the code is a bit too boilerplate, though. We use Lists.partition() from google-collections to make this almost a one-liner. – Andreas Petersson Aug 15 '09 at 09:23
  • I don't see any advantage. If you don't want to use an Oracle collection or a temp table, then use Peter Severin's solution. Peter Severin's solution results in less database calls and less parsing than this solution. Also easier in the client because you have only one batch. – tuinstoel Aug 15 '09 at 09:55
  • I wish I had a .Partition() method/function =) – Mike Atlas Aug 16 '09 at 15:27
2

Yes, a very weird situation for Oracle.

If you specify 2000 IDs inside the IN clause, it will fail. This fails:

select ... 
where id in (1,2,....2000) 

but if you simply put the 2000 IDs in another table (a temp table, for example), the query below will work:

select ... 
where id in (select userId 
             from temptable_with_2000_ids ) 

Alternatively, you could split the records into groups of 1000 and execute them group by group.

UdayKiran Pulipati
Aaron He
1

Here is some Perl code that tries to work around the limit by creating an inline view and then selecting from it. The statement text is compressed by using rows of twelve items each instead of selecting each item from DUAL individually, then uncompressed by unioning together all columns. UNION or UNION ALL in decompression should make no difference here as it all goes inside an IN which will impose uniqueness before joining against it anyway, but in the compression, UNION ALL is used to prevent a lot of unnecessary comparing. As the data I'm filtering on are all whole numbers, quoting is not an issue.

#
# generate the innards of an IN expression with more than a thousand items
#
use English '-no_match_vars';
sub big_IN_list{
    @_ < 13 and return join ', ',@_;
    my $padding_required = (12 - (@_ % 12)) % 12;  
    # get first dozen and make length of @_ an even multiple of 12
    my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l) = splice @_,0,12, ( ('NULL') x $padding_required );

    my @dozens; 
    local $LIST_SEPARATOR = ', '; # how to join elements within each dozen
    while(@_){
        push @dozens, "SELECT @{[ splice @_,0,12 ]} FROM DUAL"
    };  
    $LIST_SEPARATOR = "\n    union all\n    "; # how to join @dozens 
    return <<"EXP";
WITH t AS (
    select $a A, $b B, $c C, $d D, $e E, $f F, $g G, $h H, $i I, $j J, $k K, $l L FROM     DUAL
    union all
    @dozens
 )
select A from t union select B from t union select C from t union
select D from t union select E from t union select F from t union
select G from t union select H from t union select I from t union 
select J from t union select K from t union select L from t
EXP
}

One would use that like so:

my $bases_list_expr = big_IN_list(list_your_bases());
$dbh->do(<<"UPDATE");
    update bases_table set belong_to = 'us'
    where id in ($bases_list_expr)
UPDATE
Never Sleep Again
0

Instead of using an IN clause, can you try using a JOIN with the other table which is fetching the IDs? That way we don't need to worry about the limit. Just a thought from my side.

GDP
Raju
  • I can see that this was your first answer to a question, so you didn't have the rep to post this as a comment, but in the future, you might want to consider posting this as a comment. Answers are supposed to be a solution to the problem that you are pretty confident will solve the problem. – Fluffeh Sep 28 '12 at 10:31
  • That won't solve the problem. A join will select all the records, not just the select item or two that the IN clause picks. – PHPGuru Nov 19 '14 at 02:18
-3

Instead of SELECT * FROM table1 WHERE ID IN (1,2,3,4,...,1000);

Use this :

SELECT * FROM table1 WHERE ID IN (SELECT rownum AS ID FROM dual connect BY level <= 1000);

*Note that you need to be sure the ID does not refer to any other foreign IDs if this is a dependency. To ensure only existing IDs are available:

SELECT * FROM table1 WHERE ID IN (SELECT distinct(ID) FROM tablewhereidsareavailable);

Cheers

Simon Dugré
mousetwentytwo