1

what are the limits of data i can pass to a database in a programing language(like php). suppose i have 1 million records in my database and I have 1 million data in my hand which i want to do a exist checking. if i used a query like

 select id from table where id in (array of 1 million data)

what will happen? will this request even reach database? if it reaches, what are the posibilities ,will it returns a data a better speed than a million querys to db searching id's or a full select data call with millions of for loops.

just for curiosity!.

jithink
  • 11
  • 3
  • we do something like that using .net, we bulk insert the values to a staging table, then do ....IN (SELECT ID FROM tab...) Not so many as 1 million though, but it could still work ok – Cato May 30 '19 at 13:36

1 Answers1

3

There isn't a specific number, however, the documentation specifies you'll likely to have problems once you have "thousands" of values. IN (Transact-SQL) - Remarks:

Explicitly including an extremely large number of values (many thousands of values separated by commas) within the parentheses, in an IN clause can consume resources and return errors 8623 or 8632. To work around this problem, store the items in the IN list in a table, and use a SELECT subquery within an IN clause.

Error 8623:

The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

Error 8632:

Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.

To quote my comment I made:

If you need to pass a large number of values to a query, I suggest a Table-Type parameter. But if you really need to pass 1M+ values then it sounds like something is wrong with your design. You may even be better off listing the values you don't want.

Edit: To add to my comment, many (including myself) prefer to use EXISTS instead of IN. So instead of a query like:

FROM YourTable YT
WHERE YT.YourColumn IN (SELECT OT.YourColumn
                        FROM OtherTable OT)

You would have the query:

FROM YourTable YT
WHERE EXISTS (SELECT 1
              FROM OtherTable OT
              WHERE OT.YourColumn = YT.YourColumn)
Thom A
  • 88,727
  • 11
  • 45
  • 75
  • what about the second question which is the best approach to check a 1m data to another 1 m data in db – jithink May 30 '19 at 13:37
  • 1
    The answer covers that @jithink, the documentation states *"store the items in the IN list in a table, and use a SELECT subquery within an IN clause."* and I state *"If you need to pass a large number of values to a query, I suggest a Table-Type parameter"*. Personally, i also prefer using `EXISTS` over `IN`. – Thom A May 30 '19 at 13:40
  • @jithink I'd used `EXISTS`. See these posts [1](https://stackoverflow.com/questions/2065329/sql-server-in-vs-exists-performance), [2](https://sqlinthewild.co.za/index.php/2009/08/17/exists-vs-in/), [3](https://dba.stackexchange.com/questions/121034/best-practice-between-using-left-join-or-not-exists) for some interesting debate on performance and comparisons. – S3S May 30 '19 at 13:42