Is EXISTS more efficient than COUNT(*)>0?

Question

I'm using MySQL 5.1, and I have a query that's roughly of the form:

select count(*) from mytable where a = "foo" and b = "bar";

In my program, the only thing that it checks is whether this is zero or nonzero. If I convert this into:

select exists(select * from mytable where a = "foo" and b = "bar");

is MySQL smart enough to stop searching when it hits the first one? Or is there some other way to communicate to MySQL that my intent is simply to find out if any records match this, and I don't need an exact count?

Try explain on both queries, you should get the answer. You can post the explain outputs and ppl will help you decode. — Zimbabao, Mar 10 '11 at 19:19
It's extremely unlikely that MySQL will know to optimize `COUNT(*)>0`! — Gabe, Mar 10 '11 at 19:19
The ANSI standard says EXISTS is better because it shouldn't traverse or evaluare beyond "existence" of a row http://stackoverflow.com/questions/3271455/whats-the-best-to-check-if-item-exist-or-not-select-countidor-exist/3271464#3271464 — gbn, Mar 10 '11 at 19:21
Zimbabao: The EXPLAIN for both queries is identical, except for the addition of a "No tables used" entry for the latter case. — Ken, Mar 10 '11 at 19:38
@Ken - "No tables are used" because the `EXPLAIN` (in old versions) evaluates uncorrelated subqueries as part of figuring out the query plan. — Rick James, Dec 14 '18 at 19:50
@TomášZato - `EXISTS` returns 0 (for false) or 1 (for true). Then the outer `SELECT` returns a single row with a single column containing 0 or 1. — Rick James, Dec 14 '18 at 19:53

Thomas · Accepted Answer · 2017-04-12T15:38:27.970

49

Yes, MySQL (indeed all database systems as far as I'm aware) will stop processing when a row is returned when using an Exists function.

You can read more at the MySQL documentation: If a subquery returns any rows at all, EXISTS subquery is TRUE.

edited Apr 12 '17 at 15:38

answered Mar 10 '11 at 19:19

Thomas

63,911
12
95
141

I'm not sure SQLite does, along with other simple databases. – tc. Mar 10 '11 at 19:21
1

@tc - Yep. Even SQLLite does. Even Access. I've yet to run across a database that did not. – Thomas Mar 10 '11 at 19:22
Any proof or reference? – itsazzad Dec 21 '16 at 13:30
1

@itsazzad - [If a subquery returns any rows at all, EXISTS subquery is TRUE](https://dev.mysql.com/doc/refman/5.7/en/exists-and-not-exists-subqueries.html). In order to validate whether MySQL will determine how many rows by evaluating all rows, one need only use an explain plan with sufficiently large dataset to verify. – Thomas Dec 22 '16 at 14:56
@Thomas would be better if you add the reference to your aswer itself – Назар Топольський Apr 11 '17 at 14:38
1

@НазарТопольський - Done. – Thomas Apr 12 '17 at 15:38

score 16 · Answer 2 · answered Mar 10 '11 at 19:25

16

I have run a test with 1000 queries. SELECT EXISTS was about 25% faster than SELECT COUNT. Adding limit 1 to SELECT COUNT did not make any difference.

answered Mar 10 '11 at 19:25

linepogl

9,147
4
34
45

30

Adding `limit 1` to `select count` wouldn't make any difference, because `select count` returns a single row. :-) – Ken Mar 10 '11 at 19:27
The 25% is just this one case. It could be 0% faster (eg, the entire table needed to be scanned due to lack of sufficient index), or it could be 99% faster (optimal index, and value that makes good use of it). – Rick James Dec 14 '18 at 19:55

score 4 · Answer 3 · answered Dec 09 '14 at 08:31

This might be an approach too.

select 1 from mytable where a = "foo" and b = "bar" limit 1;

This would not traverce all records that meet the where condition but rather return '1' after first 'hit'. The drawback is you need to check for result, because there might be empty record set in return.

score 4 · Answer 4 · answered Mar 10 '11 at 19:31

4

The most reliable way is probably LIMIT 1, but that's not the point.

Provided you have an index like CREATE INDEX mytable_index_a_b ON mytable (a,b), MySQL should be smart enough to return the count from the index and not touch any rows at all. The benefit of LIMIT 1 is probably negligible.

If you don't have an index on (a,b), then performance will be terrible. LIMIT 1 may make it significantly less terrible, but it'll still be terrible.

answered Mar 10 '11 at 19:31

tc.

33,468
5
78
96

Well, I've inherited a legacy codebase, and performance *is* terrible. :-) – Ken Mar 10 '11 at 19:41
@Ken: It might be worthwhile going through the database and adding a large pile of indexes – tc. Mar 12 '11 at 04:37
The obvious indexes are already present, and some of these are fairly big tables on which insertion performance is important, too. – Ken Mar 15 '11 at 20:54
But if a million rows have that (a,b), then the `COUNT(*)` necessarily scans a million rows in the index (there is no further optimization there) _and `LIMIT 1` does not take effect until after the `COUNT` is done_. Valyo's `SELECT 1` will stop after finding the first matching (a,b), as will `EXISTS()`. – Rick James Dec 14 '18 at 20:01

Kevin Peno · Answer 5 · 2011-03-10T19:37:01.673

2

I don't know how well this works for optimization, but it should work functionally the same as exists. Exception being that it will return no row if there is no match.

SELECT true from mytable where a = "foo" and b = "bar" LIMIT 1;

edited Mar 10 '11 at 19:37

answered Mar 10 '11 at 19:30

Kevin Peno

9,107
1
33
56

Wouldn't that potentially return multiple rows, each with a single value `true`? – tc. Mar 10 '11 at 19:32
Sorry I forgot the `limit 1` in my edit. Thanks for pointing that out! – Kevin Peno Mar 10 '11 at 19:33
1

`true` and `1` are synonymous. Keep in mind that you don't get `false` (`0`), but an empty resultset for the opposite case. – Rick James Dec 14 '18 at 20:03

Is EXISTS more efficient than COUNT(*)>0?

5 Answers5

Linked

Related