0

Saw a colleague run a query today and I couldn't explain why or how it didnt fail. The query was (abstracted):

UPDATE table1
SET columnToUpdate = 1
WHERE recordID IN (
SELECT recordID FROM table2 WHERE table2column IN ( *list of values*))

The problem was he had mistaken recordID, which is only present in table1, for personID, which is present in both tables and is a PK in table 1, FK in table 2.

I would have thought the query would not have run, due to the subselect being unable to resolve (indeed, if you run the subselect alone, it fails). Instead, running the above query updated all records in table1, which were previously a combination of NULL and 0.

It was simple to fix, but can anyone explain why this behaviour would take place?

Thanks!

David
  • 235
  • 1
  • 6
  • 19
  • 1
    what you mean inner select fail?you mean give you an error or return null? – Juan Carlos Oropeza Nov 21 '16 at 17:03
  • 2
    This is how correlated subqueries work. Sure it is easy to fix but if you change your coding habits to ALWAYS specify the table in front of your columns you will greatly minimize these types of errors. – Sean Lange Nov 21 '16 at 17:05

2 Answers2

3

for these reasons, always alias your tables and use the alias in the field name, otherwise a simple mistake can make SQL (correctly) interpret the field as coming from a table other than what you intend it to - the recordid in your query is the one from Table1 if it is not in table2

UPDATE table1 t1 
 SET t1.columnToUpdate = 1
  WHERE t1.recordID IN (
   SELECT t2.recordID FROM table2 t2 WHERE t2.table2column IN ( *list of values*))
Cato
  • 3,652
  • 9
  • 12
2

It is called Correlated sub-query. It helps you to refer outer query columns inside the sub-query

The recordID column is referred from table1 in the sub-query not from the table2 that is why it is failing when you run the sub-query alone. Normally you can see outer query columns referred in Where clause when EXISTS/NOT EXISTS used

Pரதீப்
  • 91,748
  • 19
  • 131
  • 172
  • Good answer, and I can see why it runs now, but the follow up question is then- why did it update every value in table1? Is having in the select essentially producing a cartesian product of everything in table1 against nothing in table2 to check against? (i.e. all of table1) – David Nov 21 '16 at 17:21
  • 1
    it updated all records because if any rows in table2 are selected, then you selected Table1.RecordId as the single value in those row - therefore Table1.RecordId IN (Table1.RecordId) is true, if any rows at all are selected – Cato Nov 21 '16 at 17:29