1

I am getting an ORA-01795 error in my Java code when executing a query with more than 1000 values in an IN clause. I am thinking of breaking it into batches of 1000 entries, using multiple IN clauses separated by OR, like below:

  select * from table_name
  where
  column_name in (V1,V2,V3,...V1000)
  or
  column_name in (V1001,V1002,V1003,...V2000)

I have a string of ids like -18435,16690,1719,1082,1026,100759... which gets generated dynamically based on user selection. How do I write the logic in Java to split it into batches like records 1-1000, 1001-2000, etc.? Can anyone help me here?

user3161879
  • You'd be much better off making a `PreparedStatement` accepting 1000 values, then executing that statement repeatedly with batches of 1000 values. – Marko Topolnik Nov 04 '14 at 22:13

4 Answers

3

There are three potential ways around this limit:

1) As you have already mentioned: split up the statement into batches of 1000

2) Create a derived table using the values and then join them:

with id_list (id) as (
  select 'V1' from dual union all
  select 'V2' from dual union all
  select 'V3' from dual
)
select *
from the_table
where column_name in (select id from id_list);

Alternatively, you could also join those values - this might even be faster:

with id_list (id) as (
  select 'V1' from dual union all
  select 'V2' from dual union all
  select 'V3' from dual
)
select t.*
from the_table t
  join id_list l on t.column_name = l.id;

This still generates a really, really huge statement, but doesn't have the limit of 1000 ids. I'm not sure how fast Oracle will parse this though.

3) Insert the values into a (global) temporary table and then use an IN clause (or a JOIN). This is probably going to be the fastest solution.
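
For example, a minimal JDBC sketch of option 3, assuming a global temporary table named tmp_ids (a name made up for this example) has been created once up front, conn is an open java.sql.Connection, and ids holds the user's values:

// One-time DDL (run once, not per query), for example:
//   create global temporary table tmp_ids (id number) on commit preserve rows;
try (PreparedStatement insert = conn.prepareStatement(
        "insert into tmp_ids (id) values (?)")) {
    for (long id : ids) {
        insert.setLong(1, id);
        insert.addBatch();   // batch the inserts instead of one round trip per id
    }
    insert.executeBatch();
}

try (PreparedStatement query = conn.prepareStatement(
        "select t.* from the_table t join tmp_ids x on t.column_name = x.id");
     ResultSet rs = query.executeQuery()) {
    while (rs.next()) {
        // process each row
    }
}

// Remove this session's rows so the next run starts clean
try (Statement cleanup = conn.createStatement()) {
    cleanup.executeUpdate("delete from tmp_ids");
}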

1

I very recently hit this wall myself:

Oracle has an architectural limit of at most 1000 terms inside an IN()

There are two workarounds:

  1. Refactor the query to become a join
  2. Leave the query as it is, but call it multiple times in a loop, each call using less than 1000 terms

Option 1 depends on the situation. If your list of values comes from a query, you can refactor it to a join.

Option 2 is also easy, but less performant:

List<String> terms; // the full list of values
for (int i = 0; i * 1000 < terms.size(); i++) {
    List<String> next1000 = terms.subList(i * 1000, Math.min((i + 1) * 1000, terms.size()));
    // build and execute query using next1000 instead of terms
}
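
The "build and execute" part could, for example, use a PreparedStatement with one placeholder per value in the batch (conn is assumed to be an open java.sql.Connection):

// Build "?,?,...,?" with one placeholder per value (needs java.util.Collections)
String placeholders = String.join(",", Collections.nCopies(next1000.size(), "?"));
String sql = "select * from table_name where column_name in (" + placeholders + ")";

try (PreparedStatement ps = conn.prepareStatement(sql)) {
    for (int j = 0; j < next1000.size(); j++) {
        ps.setString(j + 1, next1000.get(j)); // JDBC parameters are 1-based
    }
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            // process each row
        }
    }
}

Since every batch except the last has exactly 1000 values, at most two distinct statement texts get hard-parsed per run.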
Bohemian
  • I have always used the second one. I wonder, though, whether the OR trick would also fly. – Marko Topolnik Nov 04 '14 at 22:17
  • @MarkoTopolnik yes I think it would work too (didn't think of that), but generally speaking OR usually results in indexes not being used, so performance may be a real problem. Also, the query itself would have to be modified by code (to add more IN()'s - more hassle). Finally, Java 8 parallel streams could execute each query in parallel, possibly completing in nearly the same time as one query. If the EXPLAIN plan looked OK, you could use OR - only testing would tell – Bohemian Nov 04 '14 at 22:33
  • Your mention of parallel streams is interesting... how would you ensure the 1000 cap on the size of batches assigned to a single thread, though? Also, does it really speed up things at the DB side if concurrent requests are made? – Marko Topolnik Nov 05 '14 at 08:07
  • @marko I haven't tried it yet, but I think [`Spliterator`](http://docs.oracle.com/javase/8/docs/api/java/util/Spliterator.html) allows parallel partitioned processing of an Iterable. Re simultaneous queries, you'd have to try it, but based on experience it should be faster and if not is unlikely to be slower. – Bohemian Nov 05 '14 at 13:01
  • The trouble with the Spliterator is that it splits your entire `ArrayList` down the middle, then again as necessary to saturate all CPUs. So you don't get to control the batch size. The authors are also strongly discouraging the use of the ForkJoinPool for tasks which are not computationally intensive and especially so for tasks which cause the threads to block on I/O. Manually splitting into batches and submitting to a classical Executor Service would be their advice. – Marko Topolnik Nov 05 '14 at 13:08
  • I am getting an exception when I am using the logic: List terms; for (int i = 0; i <= terms.size() / 1000; i++) { List next1000 = terms.subList(i * 1000, Math.min(i + 1000, terms.size()); // build and execute query using next1000 instead of terms } Exception is: java.lang.IllegalArgumentException: fromIndex(2000) > toIndex(1002) – user3161879 Nov 05 '14 at 15:34
  • @user3161879 simple fix - see edited code. (I didn't test it - I just thumbed the code on my iPhone) – Bohemian Nov 05 '14 at 19:23
1

With so many values I'd avoid both IN and OR in the query, and the hard-parse penalty of embedded values, if at all possible. You can pass an SQL collection of values and use the table() collection expression as a table you can join your real table to.

This uses a hard-coded array of integers as an example, but you can populate that array from your user input instead. I'm using one of the built-in collection type definitions, sys.odcinumberlist, which is a varray of numbers limited to 32k values, but you can define your own table type if you prefer, or if you might need to handle more than that.

int[] ids = { -18435,16690,1719,1082,1026,100759 };
ArrayDescriptor aDesc = ArrayDescriptor.createDescriptor("SYS.ODCINUMBERLIST", conn );
oracle.sql.ARRAY ora_ids = new oracle.sql.ARRAY(aDesc, conn, ids);

sql = "select t.* "
    + "from table(?) a "
    + "left join table_name t "
    + "on t.column_name = a.column_value "
    + "order by id";
pStmt = (OraclePreparedStatement) conn.prepareStatement(sql);
pStmt.setArray(1, ora_ids);
rSet = (OracleResultSet) pStmt.executeQuery();
...

Your array can have as many values as you like (well, as many as the collection type you use and your JVM's memory can handle) and isn't subject to the in list's 1000-member limit.

Essentially table(?) ends up looking like a table containing all your values, and this is going to be easier and faster than populating a real or temporary table with all the values and joining to that.

Of course, don't really use t.*; list the columns you need. I'm assuming you used * to simplify the question...

(Here is a more complete example, but for a slightly different scenario.)

Alex Poole
  • You can also use [`member of`](http://stackoverflow.com/a/16185941/266304) instead of joining to a table collection expression. The data types have to match for that to work, and it's probably worth testing the performance of both approaches - the size of the collection may make a difference. – Alex Poole Dec 21 '16 at 17:37
0

In such situations, when I have the ids in a List in Java, I use a utility class like this to split the list into partitions and generate the statement from those partitions:

public class ListUtils {
    public static <T> List<List<T>> partition(List<T> orig, int size) {
        if (orig == null) {
            throw new NullPointerException("The list to partition must not be null");
        }
        if (size < 1) {
            throw new IllegalArgumentException("The target partition size must be 1 or greater");
        }
        int origSize = orig.size();
        List<List<T>> result = new ArrayList<>(origSize / size + 1);
        for (int i = 0; i < origSize; i += size) {
            result.add(orig.subList(i, Math.min(i + size, origSize)));
        }
        return result;
    }
}

Let's say your ids are in a list called ids; you could get sublists of size at most 1000 with:

ListUtils.partition(ids, 1000)

Then you could iterate over the results to construct the final query string.
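
For example, to build the OR-of-IN-clauses from the question (assuming the ids are numeric and held in a List<Long> named ids; the table and column names are the placeholders from the question):

// needs java.util.stream.Collectors
StringBuilder where = new StringBuilder();
for (List<Long> batch : ListUtils.partition(ids, 1000)) {
    if (where.length() > 0) {
        where.append(" or ");
    }
    where.append("column_name in (")
         .append(batch.stream().map(String::valueOf).collect(Collectors.joining(",")))
         .append(")");
}
String sql = "select * from table_name where " + where;

As Alex Poole's answer points out, embedding the values like this incurs a hard parse for each distinct statement, so bind variables or a collection are preferable if the query runs often.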

janos