I wanted to warn people about a problem I hit (luckily in a test environment and not LIVE) when using the SQL IN statement as part of the WHERE clause.
I built a temporary table called [dataimport].[IDsToUpdate], which contained a list of 42 GUIDs that I wanted to update using the following UPDATE statement:
UPDATE dbo.IndividualSponsors
SET LeaveDate = null
WHERE IndividualSponsorID in (select IndividualSponsorID from [dataimport].[IDsToUpdate])
The [dataimport].[IDsToUpdate]
table contains a single GUID column called IndvSponsorID. Note the error in my UPDATE statement. It should have been:
select IndvSponsorID from [dataimport].IDsToUpdate
If I run select IndividualSponsorID from [dataimport].IDsToUpdate
on its own, it errors saying that the column IndividualSponsorID
is unknown (as expected).
But when I run it inside the UPDATE statement above, it updated all 148,000 records in my dbo.IndividualSponsors
table (arrrrhhh !!!).
I believe this is because the column name IndividualSponsorID
exists in the table dbo.IndividualSponsors, and when run in the full context of the UPDATE statement, SQL Server is re-pointing the column reference to that table hence the blanket update of all rows.
This seems VERY wrong to me, especially as I explicitly stated that the column IndividualSponsorID
is FROM the table [dataimport].IDsToUpdate
. It should have error'ed, as my IN SELECT statement was incorrect.
Morale: ALWAYS use table alias names when doing UPDATES (especially to live data!):
e.g. select ids.IndividualSponsorID from [dataimport].IDsToUpdate ids
This fails as expected.
Does anyone have a different view on this?