0

I need to write a simple query using oracle, java and mybatis:

select * from FOO foo where foo.id IN (ids)

Now, ids is a large collection of strings, about 7000. Unfortunately, oracle has 1000 elements limit for IN clause.

To overcome this, I can either:

  1. concatenate the query dynamically so it becomes: select * from FOO foo where foo.id IN (chunk1) or foo.id IN (chunk2) ... I'm not sure if it even works and I really doubt it performs well
  2. use temporary table and rewrite the query to: select * from FOO foo join SOME_TEMPORARY_ID tempids on foo.id = tempids.id.

I've decided to go for 2 option. Before performing the query I need to somehow do an efficient batch insert to Oracle. Unfortunately, Oracle has proprietary syntax to do batch inserts:

INSERT ALL
INTO some_table VALUES ('foo')
INTO some_table VALUES ('foo1')
INTO some_table VALUES ('foo2')
....
INTO some_table VALUES ('foo12345')
SELECT * FROM DUAL

Now, I didn't mention but I want to write an integration test for this, ideally using H2 or any other inmemory database. Of course H2 doesn't support that syntax. Neither does HSQLDB.

Do you know any in-memory database that fully supports proprietary oracle syntax? Or at least this specific one INSERT ALL clause?

slnowak
  • 1,839
  • 3
  • 23
  • 37

3 Answers3

2

Thanks that you describe your primary problem. Check, may be this can help you

where (foo.id, 0) in (('1', 0), ('2', 0),...)

This simple workaround hasn't limitation. If this answer wouldn't be appropriate then let me know. I delete this answer and think again at your child problem.

Evgeniy K.
  • 1,137
  • 7
  • 11
  • What about the performance? What should be more performant - using temporary tables or 7k elements IN clause? – slnowak Jul 16 '16 at 17:18
  • @slnowak You note that ids is collection, Why simply do smt like this 'select * from FOO foo where foo.id IN (select * from table(collection ids))'. Where ids is nested table. You can pull ids in it and then run query. About perfomance I think nested table is the best, temporaty the second and in-condition the third. – Evgeniy K. Jul 16 '16 at 17:55
  • Could you explain the syntax and the feature itself a little bit more? I didn't know it exist. Is it more or less equivalent to creating temp table? And doesn't it suffer from that 1k IN clause limit? – slnowak Jul 16 '16 at 18:29
  • @slnowak About collection see link http://stackoverflow.com/questions/400255/how-to-put-more-than-1000-values-into-an-oracle-in-clause (answer tuinstoel). Also note that in his example he makes assumption that collection 'p_numbers' has already been populated. You may do it writting script or package which has one variable and method that adding element to collection. If you have 7k elemnts then you should write 7k times add element. I think it will be very long script. – Evgeniy K. Jul 17 '16 at 07:43
  • well, that looks even worse than UNION ALL approach (1 answer before the one you mentioned) or that INSERT ALL I mention in my first post. Anyway, thanks for your help. FYI I ended up using proprietary INSERT ALL syntax and spinning up docker with real oracle instance during my integration tests and testing against that. – slnowak Jul 17 '16 at 15:56
1

Ok, so I've decided that if I really want/have to use oracle proprietary features/syntax, then let's test it against live oracle.

So if you don't care about execution time and happen to have access do Docker on your CI server, then I recommend you these excellent libraries: https://mvnrepository.com/artifact/org.testcontainers (https://github.com/testcontainers/testcontainers-java), including oracle-xe module.

I was able to spin up oracle-xe container during my integration tests and test against live oracle instance.

slnowak
  • 1,839
  • 3
  • 23
  • 37
0

The limit of the length of a single INSERT statement (i.e. the # of records you can specify values for) is a hint that this operation is really best performed with multiple insert statements.

Can you iterate over a list of id values in a program, inserting one (or a few) at a time to your temp table, which you can then proceed to join on for your final query?

Of course, this eventually resembles your original option #1, but really it is a combination of #1 and #2 I guess. Anyway, it will work! :)

SlimsGhost
  • 2,849
  • 1
  • 10
  • 16
  • It's not about the limit of INSERT statement, it's rather about limit of IN clause. – slnowak Jul 16 '16 at 17:19
  • 1
    I'm suggesting to avoid having to use the `IN` clause by joining to the "temp" table containing all the id's. As the OP states, the challenge then becomes how to insert the id's into the temp table, which is what my answer is helping to address. You would still join to the "list of id's", but with a `JOIN` instead of `IN`. – SlimsGhost Jul 18 '16 at 18:04