1

Using SQL Server Management Studio, I am getting some undesired results (looks like a bug to me..?)

If I use (FIELD rather than field for the other_table):

SELECT * FROM main_table WHERE field IN (SELECT FIELD FROM other_table)

I get all results from main_table.

Using the correct case:

SELECT * FROM main_table WHERE field IN (SELECT field FROM other_table)

I get the expected results where field appears in other.

Running the subquery on it's own:

SELECT FIELD FROM other_table

I get an invalid column name error.

Surely I should get this error in the first case?

Is this related to collation? The DB is binary collation. The server is case insensitive however. It seems to me like the server component is saying "this code is OK" and not allowing the DB to say the field is the wrong name..?

What are my options for a solution?

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
Chris
  • 39
  • 2
  • 6
  • No this is correct behaviour for a case sensitive collation. `main_table` must have a column called `FIELD` so it is using that instead as it is the only column name of that casing in scope. You need to reference table/column names with the correct casing if your DB has a binary collation. Also better to use table prefixes so this kind of error is caught. `SELECT * FROM main_table WHERE field IN (SELECT o.field FROM other_table o)` – Martin Smith Jun 01 '12 at 11:47
  • 1
    possible duplicate of [sql server 2008 management studio not checking the syntax of my query](http://stackoverflow.com/questions/4594733/sql-server-2008-management-studio-not-checking-the-syntax-of-my-query) – Martin Smith Jun 01 '12 at 11:47
  • 2
    This is an example of why it is valuable to *always* specify which table as well as which field. See what happens if you change all occurances of `field` or `FIELD` to `other_table.field` or `main_table.field`, etc. – MatBailie Jun 01 '12 at 11:53
  • Thanks all. I've aliased the tables now and have no issue. I hadn't even considered that it was looking at FIELD in main_table. – Chris Jun 01 '12 at 13:10

2 Answers2

5

Let's illustrate what is happening using something that doesn't depend on case sensitivity:

USE tempdb;
GO

CREATE TABLE dbo.main_table(column1 INT);

CREATE TABLE dbo.other_table(column2 INT);

INSERT dbo.main_table SELECT 1 UNION ALL SELECT 2;
INSERT dbo.other_table SELECT 1 UNION ALL SELECT 3;

SELECT column1 FROM dbo.main_table
WHERE column1 IN (SELECT column1 FROM dbo.other_table);

Results:

column1
-------
1
2

Why doesn't that raise an error? SQL Server is looking at your queries and seeing that the column1 inside can't possibly be in other_table, so it is extrapolating and "using" the column1 that exists in the outer referenced table (just like you could reference a column that only exists in the outer table without a table reference). Think about this variation:

SELECT [column1] FROM dbo.main_table
WHERE EXISTS (SELECT [column1] FROM dbo.other_table WHERE [column2] = [column1]);

Results:

column1
-------
1

Again SQL Server knows that column1 in the where clause also doesn't exist in the locally referenced table, but it tries to find it in the outer scope. So in an imaginary world you might consider the query to actually be saying:

SELECT m.[column1] FROM dbo.main_table AS m
WHERE EXISTS (SELECT m.[column1] FROM dbo.other_table AS o WHERE o.[column2] = m.[column1]);

(Which is not how I typed it, but if I do type it that way, it still works.)

It doesn't make logical sense in some of the cases but this is the way the query engine does it and the rule has to be applied consistently. In your case (no pun intended), you have an extra complication: case sensitivity. SQL Server didn't find FIELD in your subquery, but it did find it in the outer query. So a couple of lessons:

  1. Always prefix your column references with the table name or alias (and always prefix your table references with the schema).
  2. Always create and reference your tables, columns and other entities using consistent case. Especially when using a binary or case-sensitive collation.
Community
  • 1
  • 1
Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • Great answer and very well explained, thanks. Lesson 1 is the one for me. Lesson 2 is a bit trickier as I deal with many, many huge datasets from various sources, some I have control over, others I do not (this was an external dataset, my original issue arose from the field in question always being capitalised in all other datasets, but lowercase in this one) I'll have a read of your blogs later, thanks again! Chris – Chris Jun 01 '12 at 13:13
  • If you are using binary or case sensitive collation, you'd better get into the habit of typing the case correctly... sometimes those errors won't be caught at compile time due to deferred name resolution, and they're very embarrassing when they aren't caught until later. – Aaron Bertrand Jun 01 '12 at 13:22
  • I work in an environment where our key datasets are all in case sensitive (well, binary) collation - and I prefer it that way! In this instance I was using a dataset from someone else, who named our main linking identifier (wrongly) in lower case and simply didn't check the dataset for the column name, received no error and assumed all was ok. My embarrassment was limited as luckily a colleague noticed something wasn't right, I hadn't realised that our IT set up the server to be CI, but in this situation that was a red herring anyway. – Chris Jun 01 '12 at 13:39
3

Very interesting find. The unspoken mandate is that you always should alias tables in your subqueries and use those aliases to be explicit about which table your column comes from. Subqueries allow you to make reference to a field from your outer query which is the cause of your issue, but in your scenario I would agree that either the default should be the internal query's field list, or to give you a column ambiguity error. Regardless, this method below is always preferable:

select * from main_table a where a.field in (select x.field from other_table x)

mattmc3
  • 17,595
  • 7
  • 83
  • 103
  • This certainly looks ike the right answer, clear and straight to the point, and does solve my problem. Thanks! – Chris Jun 01 '12 at 13:06