This isn't a bug in SQL Server/the T-SQL dialect parsing, no, this is working exactly as intended. The problem, or bug, is in your T-SQL; specifically because you haven't qualified your columns. As I don't have the definition of your table, I'm going to provide sample DDL first:
CREATE TABLE dbo.Table1 (MyColumn varchar(10), OtherColumn int);
CREATE TABLE dbo.Table2 (YourColumn varchar(10) OtherColumn int);
And then an example that is similar to your query:
SELECT MyColumn
FROM dbo.Table1
WHERE MyColumn IN (SELECT MyColumn FROM dbo.Table2);
This, firstly, will parse; it is a valid query. Secondly, provided that dbo.Table2
contains at least one row, then every row from table dbo.Table1
will be returned where MyColumn
has a non-NULL
value. Why? Well, let's qualify the column with table's name as SQL Server would parse them:
SELECT Table1.MyColumn
FROM dbo.Table1
WHERE Table1.MyColumn IN (SELECT Table1.MyColumn FROM dbo.Table2);
Notice that the column inside the IN
is also referencing Table1
, not Table2
. By default if a column has it's alias omitted in a subquery it will be assumed to be referencing the table(s) defined in that subquery. If, however, none of the tables in the sub query have a column by that name, then it will be assumed to reference a table where that column does exist; in this case Table1
.
Let's, instead, take a different example, using the other column in the tables:
SELECT OtherColumn
FROM dbo.Table1
WHERE OtherColumn IN (SELECT OtherColumn FROM dbo.Table2);
This would be parsed as the following:
SELECT Table1.OtherColumn
FROM dbo.Table1
WHERE Table1.OtherColumn IN (SELECT Table2.OtherColumn FROM dbo.Table2);
This is because OtherColumn
exists in both tables. As, in the subquery, OtherColumn
isn't qualified it is assumed the column wanted is the one in the table defined in the same scope, Table2
.
So what is the solution? Alias and qualify your columns:
SELECT T1.MyColumn
FROM dbo.Table1 T1
WHERE T1.MyColumn IN (SELECT T2.MyColumn FROM dbo.Table2 T2);
This will, unsurprisingly, error as Table2
has no column MyColumn
.
Personally, I suggest that unless you have only one table being referenced in a query, you alias and qualify all your columns. This not only ensures that the wrong column can't be referenced (such as in a subquery) but also means that other readers know exactly what columns are being referenced. It also stops failures in the future. I have honestly lost count how many times over years I have had a process fall over due to the "ambiguous column" error, due to a table's definition being changed and a query referencing the table wasn't properly qualified by the developer...