210

As the title suggests... I'm trying to figure out the fastest way with the least overhead to determine if a record exists in a table or not.

Sample query:

SELECT COUNT(*) FROM products WHERE products.id = ?;

    vs

SELECT COUNT(products.id) FROM products WHERE products.id = ?;

    vs

SELECT products.id FROM products WHERE products.id = ?;

Say the ? is swapped with 'TB100'... both the first and second queries will return the exact same result (say... 1 for this conversation). The last query will return 'TB100' as expected, or nothing if the id is not present in the table.

The purpose is to figure out if the id is in the table or not. If not, the program will next insert the record, if it is, the program will skip it or perform an UPDATE query based on other program logic outside the scope of this question.

Which is faster and has less overhead? (This will be repeated tens of thousands of times per program run, and will be run many times a day).

(Running this query against M$ SQL Server from Java via the M$ provided JDBC driver)

SnakeDoc
  • 13,611
  • 17
  • 65
  • 97

16 Answers16

267

EXISTS (or NOT EXISTS) is specially designed for checking if something exists and therefore should be (and is) the best option. It will halt on the first row that matches so it does not require a TOP clause and it does not actually select any data so there is no overhead in size of columns. You can safely use SELECT * here - no different than SELECT 1, SELECT NULL or SELECT AnyColumn... (you can even use an invalid expression like SELECT 1/0 and it will not break).

IF EXISTS (SELECT * FROM Products WHERE id = ?)
BEGIN
--do what you need if exists
END
ELSE
BEGIN
--do what needs to be done if not
END
Nenad Zivkovic
  • 18,221
  • 6
  • 42
  • 55
  • doesn't this have to first execute the SELECT statement, then execute the IF EXISTS statement... causing additional overhead and therefore more processing time? – SnakeDoc Aug 08 '13 at 03:37
  • 9
    @SnakeDoc No. `Exists` works with `select` in such a fashion that it exits as soon as one row is found. Furthermore exists merely notes the existence of record, not actual values in the record, saving the need to load the row from disk (assuming search criteria is indexed, of course). As for overhead of `if` - you will have to spend this minuscule time anyway. – Nikola Markovinović Aug 08 '13 at 09:14
  • 1
    @NikolaMarkovinović interesting point. I'm not sure if an Index exists on this field, and my newbish SQL doesn't know how to find out. I am working with this DB from Java via JDBC and the database is remotely located in a colo somewhere. I've only been provided a "database summary" which just details which fields exist in each table, their type, and any FK or PK's. Does this change anything? – SnakeDoc Aug 08 '13 at 14:51
  • 4
    @SnakeDoc To find out about table structure, including foreign keys and indexes, run [sp_help](http://technet.microsoft.com/en-us/library/ms187335.aspx) *table_name*. Indexes are essential when it comes to retrieving a few rows out of many, wherther using `select top` or `exists`; if they are not present sql engine will have to perform table scan. This is the least desirable table search option. If you are not authorized to create indexes you will have to communicate to technical staff on the other side to find out whether they adjust them automatically or they expect you to suggest indexes. – Nikola Markovinović Aug 08 '13 at 23:34
  • +1 for taking the time to explain this to me. Thank you. I'm going to contact them to see if indexes exist and/or if they can be created for this. – SnakeDoc Aug 09 '13 at 00:57
  • Since the row in `Products` could be deleted or have its id changed, would the whole code snippet in the answer be wrapped in a transaction as well? Thanks! – paulkon Nov 04 '14 at 08:48
  • How this can be implemented in c# code? I would like to create a method to check if a record exists. the method should return true or false. Do I need to say if exists SELECT 1 and else if SELECT 0 then use ExecuteScalar to get the value? – akd Jul 01 '16 at 09:47
  • @akd This is not for C#, this is for TSQL. – Karl Gjertsen Jul 07 '17 at 08:28
  • @akd In EF (or rather LINQ to whatever), you can use the .Any() extension method to get EXISTS functionality. – Brandon Oct 13 '17 at 18:26
  • 1
    Is it possible to directly return the result of EXISTS? Like `select exists() ;` – Konstantin Feb 28 '18 at 17:09
  • 6
    @Konstantin You can do something like `SELECT CASE WHEN EXISTS(..) THEN 1 ELSE 0 END;` – Nenad Zivkovic Feb 28 '18 at 17:28
210

SELECT TOP 1 products.id FROM products WHERE products.id = ?; will outperform all of your suggestions as it will terminate execution after it finds the first record.

gonsalu
  • 3,154
  • 2
  • 18
  • 19
Declan_K
  • 6,726
  • 2
  • 19
  • 30
  • 6
    Doesn't optimizer take it into account itself when searches through PK (or any other unique key)? – zerkms Aug 07 '13 at 21:56
  • 1
    In my case `products.id` is not a PK... it's just a normal field. – SnakeDoc Aug 07 '13 at 21:57
  • 5
    He nver stated that is was the PK, but if so then yes the optimizer would take that into account. – Declan_K Aug 07 '13 at 21:58
  • 5
    @Declan_K: seems like my magic sphere failed in this case and a column entitled as `id` isn't PK. So +1 to your advice. – zerkms Aug 07 '13 at 22:00
  • @zerkms lol go figure... this isn't my created db, but a provided one I have to work with... `id` field is the actual SKU of the product... and `catalogid` is the PK which is just a counter. – SnakeDoc Aug 07 '13 at 22:01
  • 6
    If it is not the PK, I would also suggest making sure there is an index on that column. Otherwise, the query will have to do a table scan instead of a faster table seek. – CD Jorgensen Aug 07 '13 at 22:23
  • wouldn't it be better to do `select top 1 t From products where products.id = ?;`? then it doesnt have to read any value (so if it's looking at a nonclustered index, it won't have to hop back to the PK if it's returning a PK value, etc.)? that's what I typically see and use – tommy_o Aug 07 '13 at 23:31
  • 1
    It is worth noting that for example SQLITE database accept "Limit X" syntax, so it should be `SELECT products.id FROM products WHERE products.id = ? LIMIT 1` – Paweł Brewczynski Mar 27 '14 at 09:08
  • 6
    I think we should consider @nenad-zivkovic answer over this one. – Giulio Caccin Jul 07 '15 at 23:06
30

Nothing can beat -

SELECT TOP 1 1 FROM products WHERE id = 'some value';

You don't need to count to know if there is a data in table. And don't use alias when not necessary.

AgentSQL
  • 2,810
  • 20
  • 22
  • 5
    In spite of its name `id` is not primary key. So, even though you are not *counting* you still need to find all matching records, possibly thousands of them. About aliasing - code is constant work in progress. You never know when you'll have to go back. Aliasing helps preventing stupid runtime errors; for example, unique column name *that didn't need an alias* is not unique any more because somebody created a column of same name in another, joined table. – Nikola Markovinović Aug 07 '13 at 22:21
  • Yes, you are absolutely right. Aliasing helps a lot but i don't think it makes any difference when not using joins. So, I said don't use it if not necessary. :) And you can find a long discussion [here](http://blog.sqlauthority.com/2010/02/21/sql-server-if-existsselect-null-from-table-vs-if-existsselect-1-from-table/) on checking existence. :) – AgentSQL Aug 07 '13 at 22:34
  • 3
    I don't know why I acccepted the term `aliasing`. Correct term is `qualifying`. Here is [longer explanation by Alex Kuznetzov](http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/10/25/defensive-database-programming-qualifying-column-names.aspx). About single table queries - it is single table *now*. But later, when bug is discovered and you are trying to hold the flood, client is nervous, you join another table just to face error message - easily correctable message, but not at this sweaty moment, a small stroke strikes - and you correct the error remembering never to leave a column ... – Nikola Markovinović Aug 07 '13 at 22:47
  • 1
    Can't ignore that now. Thanks!! :) – AgentSQL Aug 07 '13 at 23:10
17
SELECT CASE WHEN EXISTS (SELECT TOP 1 *
                         FROM dbo.[YourTable] 
                         WHERE [YourColumn] = [YourValue]) 
            THEN CAST (1 AS BIT) 
            ELSE CAST (0 AS BIT) END

This approach returns a boolean for you.

Ram
  • 3,092
  • 10
  • 40
  • 56
Kris Coleman
  • 416
  • 3
  • 11
  • 1
    Can probably omit the Top statement and the * statement to make it a bit more faster, as Exist will exit once it finds a record, so something like this: SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.[YourTable] WHERE [YourColumn] = [YourValue]) THEN CAST (1 AS BIT) ELSE CAST (0 AS BIT) END – Stefan Zvonar Jul 04 '17 at 01:58
  • This suggestion fails to mention why this would be faster over the built-in exists / not exists statements within SQL Server. Without any benchmarking i'd be hard-pressed to believe that a case statement would yeild a faster result than an immediate true/false response. – Bonez024 Apr 09 '19 at 16:10
12

Don't think anyone has mentioned it yet, but if you are sure the data won't change underneath you, you may want to also apply the NoLock hint to ensure it is not blocked when reading.

SELECT CASE WHEN EXISTS (SELECT 1 
                     FROM dbo.[YourTable] WITH (NOLOCK)
                     WHERE [YourColumn] = [YourValue]) 
        THEN CAST (1 AS BIT) 
        ELSE CAST (0 AS BIT) END
Stefan Zvonar
  • 3,959
  • 3
  • 24
  • 30
11

You can also use

 If EXISTS (SELECT 1 FROM dbo.T1 WHERE T1.Name='Scot')
    BEGIN
         --<Do something>
    END 

ELSE    
     BEGIN
       --<Do something>
     END
Mohammad Atiour Islam
  • 5,380
  • 3
  • 43
  • 48
6

Below is the simplest and fastest way to determine if a record exists in database or not Good thing is it works in all Relational DB's

SELECT distinct 1 products.id FROM products WHERE products.id = ?;
manish Prasad
  • 636
  • 6
  • 16
2
SELECT COUNT(*) FROM products WHERE products.id = ?;

This is the cross relational database solution that works in all databases.

rogue lad
  • 2,413
  • 2
  • 29
  • 32
2

For those stumbling upon this from MySQL or Oracle background - MySQL supports the LIMIT clause to select a limited number of records, while Oracle uses ROWNUM.

Werner
  • 449
  • 4
  • 12
1

For MySql you can use LIMIT like below (Example shows in PHP)

  $sql = "SELECT column_name FROM table_name WHERE column_name = 'your_value' LIMIT 1";
  $result = $conn->query($sql);
  if ($result -> num_rows > 0) {
      echo "Value exists" ;
  } else {
      echo "Value not found";
  }
Muhammed Fasil
  • 7,909
  • 2
  • 19
  • 28
0
create or replace procedure ex(j in number) as
i number;
begin
select id into i from student where id=j;
if i is not null then
dbms_output.put_line('exists');
end if;
exception
   when no_data_found then
        dbms_output.put_line(i||' does not exists');

end;
code-jaff
  • 9,230
  • 4
  • 35
  • 56
kiran
  • 1
  • 2
    Possibly your code works great, but it would be better if you add some additional information so that is better understandable. – idmean May 25 '14 at 14:39
0

I've used this in the past and it doesn't require a full table scan to see if something exists. It's super fast...

UPDATE TableName SET column=value WHERE column=value
IF @@ROWCOUNT=0
BEGIN
     --Do work
END             
Barett
  • 5,826
  • 6
  • 51
  • 55
0

SQL SERVER 2012+

SELECT IIF((SELECT TOP 1 1 FROM dbo.[YourTable] WHERE [YourColumn] = [YourValue]) IS NULL, 0, 1)
Mohammad Dayyan
  • 21,578
  • 41
  • 164
  • 232
0

May you wanna try my way:

IF (SELECT TOP 1 1 FROM [TableName]) = 1
BEGIN
     --Do work
END 
GinCanhViet
  • 353
  • 5
  • 13
0
Select count(*) as RecordsExists from 
( SELECT TOP 1 * 
  FROM 
  Your-Table-Name 
  WHERE Column-Name-A='Y' or Column-Name-A ='N'
) 
as x

This Statement will get you the Records Count Based on Inner Statement whether the Records exists or not. It is divided into 2 parts

  1. The Inner Nested Statement will send you the Records whether it exists or not
  2. The Outer Statement will give you the count based on the Records provided by the inner Statement.
  • If the inner statement gives you "No Records" then the Outer Statement will give you "0"
  • If the inner statement gives you "1 Record" then the Outer Statement will give you "1"
iLearn
  • 991
  • 1
  • 13
  • 27
0
SELECT IIF(EXISTS(SELECT 1 FROM products WITH (NOLOCK) WHERE products.id = ?), 1, 0);

This is the fastest way because:

  • It uses EXISTS rather than COUNT or SELECT TOP 1. This is the database native way of determining whether a row exists.
  • Uses NOLOCK to avoid contention.

Additionally, this is more ergonomic than a CASE expression.

Metalogic
  • 498
  • 6
  • 16