What is easier to read in EXISTS subqueries?

Question

It's a question of readability. There is no difference in performance.
_{Old versions of SQL Server were silly enough to look up meta data, but not any more.}

SELECT foo FROM bar WHERE EXISTS (SELECT * FROM baz WHERE baz.id = bar.id);
SELECT foo FROM bar WHERE EXISTS (SELECT 1 FROM baz WHERE baz.id = bar.id);

I am not considering NULL or "fun variants" which don't seem intuitive to me.

~~SELECT foo FROM bar WHERE EXISTS (SELECT NULL FROM baz WHERE baz.id = bar.id); SELECT foo FROM bar WHERE EXISTS (SELECT 1/0 FROM baz WHERE baz.id = bar.id);~~

The question popped up in comments just now. I researched the manuals of the most popular RDBMS:

MS SQL seems to favor SELECT * in the manual.
The example in the PostgreSQL 9.4 manual uses SELECT 1.
Oracle 11g has SELECT * in the language reference.
MySQL 5.7 has SELECT * in the reference manual but alsoSELECT 1 in the comments.
SQLite has no example in the language reference.

A search on SO for code:"EXISTS (SELECT 1" yields 5,048 results.
A search on SO for code:"EXISTS (SELECT *" yields 5,154 results.
_{Updated links and counts 07.2015.}

So SELECT * has the popular vote and the big commercial RDBMS on its side.
I find SELECT 1 more intuitive. It's like saying "if at least one exists".
Is SELECT * more intuitive?

Why would `SELECT 1` be like saying "if at least one exists"? I don't see that makes intuitive sense at all. If someone wrote `SELECT 2` would you intuitively think that was checking at least 2 exist? — Martin Smith, Oct 10 '11 at 08:29
@MartinSmith: you are using *intuition* and *thinking* in one sentence there. — Erwin Brandstetter, Oct 10 '11 at 08:35
And the problem with that is what? Intuition is a thought process. — Martin Smith, Oct 10 '11 at 08:36
@MartinSmith: I'd more closely relate intuition to perception, while I'd describe thinking as an active rational process. But that gets philosophical quickly. — Erwin Brandstetter, Oct 10 '11 at 08:47
But intuition leads to a thought or there is no point surely? — Martin Smith, Oct 10 '11 at 08:48
Maybe these might be better suited to discuss these non-development concepts? http://linguistics.stackexchange.com/ and http://philosophy.stackexchange.com/ — gbn, Oct 10 '11 at 08:52
@Erwin, I know you have already said you won't consider `select NULL`, but I would ask you to reconsider. The *only* time anyone would select null is where they don't care *what* is being returned - to me, it signifies that the only purpose of the query is to check for existence, and is therefore more intuitive than any other option. — , Oct 10 '11 at 09:08
@MarkBannister: The *only time* I would `select NULL`? That's just not true. Example (one of many): `INSERT INTO foo SELECT NULL FROM bar WHERE baz`; — Erwin Brandstetter, Oct 13 '11 at 23:25
@Erwin: why would you ever insert multiple null rows into a single-columned table? — , Oct 14 '11 at 09:05
@MarkBannister: For instance when foo has meaningful defaults and only the time of execution and the number of rows is relevant. — Erwin Brandstetter, Oct 14 '11 at 14:23
@Erwin: unless you were using auto-generated unique IDs, that would give you duplicate rows. Inserting a single count value would seem to make more sense. — , Oct 14 '11 at 15:20
@MarkBannister: No, it would *not* make more sense. The point is, there are valid uses of `SELECT NULL`. — Erwin Brandstetter, Oct 14 '11 at 15:38
@Erwin: no, this insert would violate first normal form. There may be other valid uses of `SELECT NULL`, but you haven't suggested any. — , Oct 14 '11 at 15:53
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/4268/discussion-between-mark-bannister-and-erwin-brandstetter) — , Oct 14 '11 at 15:53
@MarkBannister: not sure if you have seen my answer in the chat. It's gone by now. You are right about violating 1st NF. My example was sloppy. The example would have to be `INSERT INTO foo(val) SELECT NULL FROM bar WHERE baz;` for a table defined `TABLE foo(id serial, val text, ts timestamp default current_timestamp)`. In other words: insert a number of events with an unknown value, but the order of events and / or the timestamp are relevant. — Erwin Brandstetter, Nov 01 '11 at 18:10
Voted to re-open. @Erwin, I can see why you might want to insert one value at a time, to record timestamps, with a `select null`. I think this is another example of what I said in my original comment - "The only time anyone would select null is where they don't care what is being returned". Inside a sub-query, the only time you would do this (that I can think of) would be in an `exists` clause, which is why I think this is the most intuitive. — , Nov 01 '11 at 20:53
You want to know which is more intuitive, but the only place intuitiveness would add value here would be for real beginners. I have had beginners ask me before why I was doing "Select 1", but they seem to understand what's going on pretty intuitively with "Select *". So the fact that the one raises the question and the other doesn't leads me to believe * is more intuitive. — Brandon Moore, Nov 14 '11 at 23:59

score 10 · Answer 1 · answered Oct 10 '11 at 08:34

10

Intuitive is ...EXISTS (SELECT * .. because you really don't care

The only keyword of importance is EXISTS
The choice of ...EXISTS (SELECT 1 .. perpetuates the general myths and superstitions around EXISTS (eg comments on the MySQL docs).
ANSI standard says "doesn't matter"
It's more interesting to understand that EXISTS is a semi-join.

answered Oct 10 '11 at 08:34

gbn

422,506
82
585
676

What general myths? `SELECT 1` has my vote – Evan Carroll Mar 24 '17 at 02:49
@EvanCarroll why? – gbn Mar 24 '17 at 08:53
1

(a) the spec defines it as the same, (b) it's cleaner -- you clearly don't care about the columns, (c) the behavior is more consistent on different versions, some databases don't optimize the condition away yet. – Evan Carroll Mar 24 '17 at 15:45
6

Well, if you care for absolute "cleanness", `EXISTS (SELECT FROM a)` works fine in Postgres ;) – ypercubeᵀᴹ Mar 26 '17 at 16:57
2

I always use `EXISTS ( SELECT 'any rows?' FROM...` because it's in the spirit of the language to read like English. – Davos Nov 06 '17 at 05:51
@Davos That's quite a good idea and is completely ignored by the engine. – gbn Nov 06 '17 at 11:25
I thought so :) and it's not one of the "fun" (silly) ones like select 1/0 – Davos Nov 07 '17 at 01:34
@ErwinBrandstetter the question has been closed (since 2011 !!) – ypercubeᵀᴹ Oct 05 '18 at 17:46
@ypercubeᵀᴹ: Oh, right. :) I just linked to this one from a new question. And your comment is the answer I would finally accept. Sorry for the noise. (Shouldn't have been closed IMHO, btw.) – Erwin Brandstetter Oct 05 '18 at 21:29

score 5 · Answer 2 · edited Oct 10 '11 at 08:45

5

I still use EXISTS (SELECT * ...), for historical (gbn: should that be hysterical?) reasons. Technically, there is no difference, of course; the optimiser / planner will throw it away and reduce it to one bit of information. For the human reader the * looks more special, it will stand out as a special symbol, and not as a value or constant. Also, I tend to reduce the amount of literals and magic constants in my programs (eventually, only 0 and 1 should remain).

edited Oct 10 '11 at 08:45

gbn

422,506
82
585
676

answered Oct 10 '11 at 08:41

wildplasser

43,142
8
66
109

3

Do you mean "hysterical" or "historical"...? "hysterical" is more correct when talking about the myths and superstitions... – gbn Oct 10 '11 at 08:43
1

The pun was intended... – wildplasser Jan 03 '18 at 14:52

score 3 · Answer 3 · edited Jan 07 '13 at 19:48

3

In the context of EXISTS the SQL optimizer knows that it doesn't matter what it returns as long as it returns something. So to you it doesn't matter.

For the intuitive part: I don't think * will be right.

It's better to ask in words: "check whether even the slightest part exists" - meaning 1 (or something else).

edited Jan 07 '13 at 19:48

Erwin Brandstetter

605,456
145
1,078
1,228

answered Oct 10 '11 at 08:26

Royi Namir

144,742
138
468
792

2

The slightest part is a row here. That's what `EXISTS` checks, if a row exists. That's why `SELECT *` as `SELECT the-whole-row` looks more intuitive to many. – ypercubeᵀᴹ Oct 10 '11 at 08:59
Thanks. for correcting. I'm still having difficulties understanding @gbn's answer. if the inside loop does `select *` so this is being translated behind the scenes to columns - which takes more time. my solution returns only 1. I dont care about columns. Am I wrong ? ( lol and remove this 93 age of yours hhhhhh:-) – Royi Namir Jan 07 '13 at 19:54
Yes, even MySQL (not so clever) optimizer knows that `EXISTS (SELECT 1 ...)` and `EXISTS (SELECT * ...)` will either be true or false, depending on whether there is a row that **would be returned by** the subquery. In other words, optimizers know that they don't have to actually run the subquery and return something. They only check if it would return a row. Think of it as "the optimizers runs it always as `(SELECT 1 ...)`" when it is inside an `EXISTS`. – ypercubeᵀᴹ Jan 07 '13 at 20:01
@ypercube how can this be checked ? I mean something like refelector for sql...? the execution plan wont show me this conversion from `select *` or `select 1` ....this is the kind of things I dont like about SQL. – Royi Namir Jan 07 '13 at 20:04
This is different for each product. For MySQL for example, check this: [Tracing the Optimizer](http://dev.mysql.com/doc/internals/en/optimizer-tracing.html) – ypercubeᵀᴹ Jan 07 '13 at 20:10
PostgreSQL cares. That said, `SELECT 1` has my vote. – Evan Carroll Mar 24 '17 at 02:49
In fact, even `select null` would work in PostgreSQL, It is the existence of rows what `exists` measures – coterobarros Jul 18 '20 at 10:19

What is easier to read in EXISTS subqueries?

3 Answers3

Linked

Related