0

I have to execute below query through JDBC call

select primaryid from data where name in ("abc", adc", "anx");

Issue is inside in clause I have to pass 11000 strings. Can I use prepared statement here? Or any other solution any one can suggest. I dont want to execute the query for each record, as it is consuming time. I need to run this query in very less time.

I am reading the strings from an XML file using DOMParser. and I am using sql server db.

Clockwork-Muse
  • 12,806
  • 6
  • 31
  • 45
Aarav
  • 59
  • 2
  • 11
  • please explain the problem a bit more detailed. is it creating long query ? – Vilen Feb 28 '14 at 10:48
  • Where do you get the strings from? Another query or from somewhere outside the DB? If I remember correctly there's an limit to the number of parameters you can have in a prepared statement, around 2000 or so. – jpw Feb 28 '14 at 10:50
  • 2
    For oracle, the limit is 1000. For other databases, the limit is often the max length of a query. If this list is not obtainable from a DB query, you could insert it into a temp table and join on this temp table. – JB Nizet Feb 28 '14 at 10:51
  • possible duplicate of [PreparedStatement with list of parameters in a IN clause](http://stackoverflow.com/questions/3107044/preparedstatement-with-list-of-parameters-in-a-in-clause) – Luca Basso Ricci Feb 28 '14 at 11:09
  • @Vilen: query is very simple( written in question) but due to number of parameters inside in clause it gets very big. So I can not write select * from data where id in (?,?,?). Also, the number of values are not constant. It varies. – Aarav Feb 28 '14 at 11:17
  • @bellabx: It is not duplicate as that question is not dealing with such huge data. Considering the huge data(that also is variable), I need solution. – Aarav Feb 28 '14 at 11:23
  • @Aarav thanks for details so if there are so many strings wouldn't it be easier to keep all those strings in separate table and instead of name have name_fk as foreign key to those strings, finally in a where clause use another select that will check whether name in given strings table or not. – Vilen Feb 28 '14 at 11:36

4 Answers4

2

I'm just wondering why you would need to have a manual set of 11,000 items where you need to specify each item. It sounds like you need to bring the data into a staging table (surely it's not been selected from the UI..?), then join to that to get your desired resultset.

user3358344
  • 193
  • 1
  • 9
  • I need to process that 11K records data(varies day by day), so I need to find details from db for each of them. Earlier I was executing prepared statement for each record, but it takes a lot of time. So thought of getting all the values all together. I dont knw this will work or not – Aarav Feb 28 '14 at 11:20
  • Those 11,000 records must be coming from a data source, right? It sounds like you need to push this data into the database on a daily schedule, then perform the requisite join. – user3358344 Feb 28 '14 at 11:23
  • Yes I am getting data from many streams/units. This data I have to process,combine and create a output file for some other stream. – Aarav Feb 28 '14 at 11:25
  • I've done a few projects where data has needed to be aggregated from disparate data sources, by the sounds of it you still need to bring them into a single source into the database. It would be the most efficient way of processing the data once you have it. You can easily truncate the table on a daily basis if you don't need to keep it for historical purposes. – user3358344 Feb 28 '14 at 11:29
  • Sorry.. But I can not make any changes in db. Whatever I can do is through file. – Aarav Feb 28 '14 at 11:30
  • In that case you're stuck with some sort of batch processing of records in the file, perhaps a hundred items in your `IN` statement at a time as to avoid any limits. But a new temporary table would be best if you can do it. :) – user3358344 Feb 28 '14 at 11:35
  • Okay.. Can you plz elaborate the table part a little. This is what I have understood.I will create a temp table from java program and fill all the values in it(How?- this also will have same values problem). If its done than I will query joining my temp table with data table. right? – Aarav Feb 28 '14 at 11:38
  • You'll have to insert each row into the temp table one by one, but it's likely smaller than the other table so should be more efficient. You can then join the temp table to your main data table then delete the temp table when you're finished. – user3358344 Feb 28 '14 at 11:50
  • 1
    Yes, doing it to a temp table is the best route (especially at this data size). Fortunately, most RDBMSs have bulk data-loading operations, which is **much** better than row-at-a-time inserts. – Clockwork-Muse Feb 28 '14 at 12:47
  • I got the impression that there was no access to bulk load software, but yes, absolutely use it if it is available. – user3358344 Feb 28 '14 at 12:49
1

Using an IN clause with 11k literal values is a really bad idea - off the top of my head, I know one major RDBMS (Oracle) that doesn't support more than 1k values in the IN list.

What you can do instead:

  • create some kind of (temporary) table T_NAMES to hold your names; if your RDBMS doesn't support "real" (session-specific) temporary tables, you'll have to add some kind of session ID
  • fill this table with the names you're looking for
  • modify your query to use the temporary table instead of the IN list: select primaryid from data where name in (select name from T_NAMES where session_id = ?session_id) or (probably even better) select primaryid from data join t_names on data.name = t_names.name and t_names.session_id = ?session_id (here, ?session_id denotes the bind variable used to pass your session id)
Frank Schmitt
  • 30,195
  • 12
  • 73
  • 107
0

A prepared statement will need to know the number of arguments in advance - something along the lines of :

PreparedStatement stmt = conn.prepareStatement(
    "select id, name from users where id in (?, ?, ?)");
stmt.setInt(1);
stmt.setInt(2);
stmt.setInt(3);

11,000 is a large number of parameters. It may be easiest to use a 'batch' approach as described here (in summary - looping over your parameters, using a prepared statement each time)

Note - if your 11,000 strings are the result of an earlier database select, then the best approach is to write a stored procedure to do the whole calculation in the database (avoiding passing the 11,000 strings back and forth with your code)

Graham Griffiths
  • 2,196
  • 1
  • 12
  • 15
  • two options - either write the 11,000 values in to a table in the DB instead of to the XML file (as JaneD suggests) - or use a prepared statement in a loop, passing in say 500 parameters at a time, as in my answer. Hers is a better solution, but you may not have control over the process of creating the XML file. – Graham Griffiths Feb 28 '14 at 11:29
  • So, I have to write '?' 500 times. Is that the only possible way.:-( – Aarav Feb 28 '14 at 11:31
  • use dynamic sql instead of prepared statement, but still in batches? So, you still have a loop to put 500 parameters into a single query, but instead of a prepared statement you just build up a string "WHERE IN (....)". It will be slower and less type safe...but cleaner code. – Graham Griffiths Feb 28 '14 at 12:04
0

You can merge all your parameter strings into one bitg string separating by ';' char

bigStrParameter=";abc;adc;anx;"

And use LOCATE to find substring.

select primaryid from data where LOCATE(concat(';',name,';'),?)>=0;
StanislavL
  • 56,971
  • 9
  • 68
  • 98
  • How this query is working can you please explain.. I liked your answer. Hope this works for me. – Aarav Feb 28 '14 at 12:42
  • It tries to find substring in the big string. If the name is part of the big merged string the chanr index returned by LOCATE >0 – StanislavL Feb 28 '14 at 13:06
  • is name =bigstrparameter?. Locate string will return index of name in '?' ? I am totally confused with your sql. – Aarav Mar 01 '14 at 05:29