2

I have a table,

Contacts: Contact_ID (int) ContactName (nvarchar)

I am given a list of contact IDs to select from. Usually, i would just do

SELECT *
FROM Contacts
WHERE IN (List of contact ID)

The problem is, the list of contact IDs could potentially get very big, like 50k or more.

So my question is, Is there a way to deal with a large list of contact IDs without having to use the IN clause?

EDIT: I'm using Microsoft sql server. The query and the contact IDs are built during runtime and passed on to sqlCommand class (c#) to be executed.

Vibol
  • 615
  • 1
  • 8
  • 25
  • 3
    use a join between two tables and allow the INNER join to eliminate records... use a correlated subquery and the 'exists' clause (or similar depending on RDBMS) (Exists is ***generally*** fastest followed by inner join, followed by IN) so what RDBMS? and is the List of contact ID's in a table somewhere? (not sure what you mean by "Given a list of contacts" Insert the "List" into a temp table and use exists, or inner join... but that has overhead... what are you trying to achieve? Just improved readability? – xQbert Aug 28 '15 at 13:04
  • 4
    Which DBMS are you using? Every optimizer behaves differently –  Aug 28 '15 at 13:05
  • Not sure which DBMS you're on, but most will let you put those IDs in a temp table and join to the temp table. That will probably be faster. – Brian Driscoll Aug 28 '15 at 13:08
  • SQLServer automatically converts large IN clauses (>50 items or so) to TempTable + EXISTS. There is no performance benefit in changing. – adrianm Aug 28 '15 at 13:11
  • 1
    @xQbert seems like my question is kinda vague, sorry. I should clarify that the query is built during runtime, and so is the list of contact IDs. When i tried to run this query with a large in clause, it sql server gives me a "run out of internal resource" error. – Vibol Aug 28 '15 at 13:14
  • @adrianm . . . Do you have a reference for that behavior? – Gordon Linoff Aug 28 '15 at 13:32
  • Can you query for the contact IDs? If so, you could use "Select * from contacts where contactid in (select contactid from contactIDs) tbl1" .... or something similar – user3476534 Aug 28 '15 at 13:37
  • @GordonLinoff - Discussed in the comments here http://sqlblog.com/blogs/paul_white/archive/2011/02/16/when-is-a-seek-not-a-seek.aspx. – Martin Smith Aug 28 '15 at 13:39
  • @GordonLinoff, only looking at performance and query plan for different number of items in my own tests. – adrianm Aug 28 '15 at 13:52

3 Answers3

3

I'd create a table type with a single column and a clustered primary key.

CREATE TYPE dbo.ContactId AS TABLE
(
    ContactId INT NOT NULL PRIMARY KEY 
);

Pass the values into the query using a table valued parameter.

Change your query to

SELECT *
FROM Contacts
WHERE contactID  IN (SELECT y.contactID FROM @yourtablevaluedparameter y)
OPTION (RECOMPILE)

The OPTION (RECOMPILE) is there to get the number of rows taken into account as the optimal plan for 50K may well be different than for 1.

You can find some example C# code for populating a TVP here

Community
  • 1
  • 1
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • creating another table might be the way to go. I'm trying out this solution right now. Will let you know! thanks! – Vibol Aug 28 '15 at 13:38
0

If you want performance, I would use the EXISTS clause.

SELECT c.Contact_ID, c.ContactName
FROM Contacts c
WHERE EXISTS (List of contact ID)
0

Create a temp table and populate your contact id in the form of rows. Do an inner join between your table and temp table like the below.

SELECT c.*
FROM Contacts c
join #temptable t
on c.id=t.id

If you introduce index on the Join column in your temp table then your query will be more faster.

StackUser
  • 5,370
  • 2
  • 24
  • 44