0

Whenever I read how to use a JOIN to UPDATE a table, the usage of aliases always throws me off just slightly. For example, take this code from here

UPDATE ud
SET assid = s.assid
FROM ud u
JOIN sale s ON u.id=s.id

It's clear to me that assid comes from the ud table on the line 1. What isn't clear to me is why ud needed to be aliased on lines 3 and 4. This isn't an isolated incident. For example, another answer on that same page is

update u
set u.assid = s.assid
from ud u
    inner join sale s on
        u.id = s.udid

which uses the alias u in a great many places.

This gives me my question: What exactly are the rules for what you do/don't need to alias when trying to use a JOIN to UPDATE? Directly quoting from any relevant docs would be appreciated, but not required.

J. Mini
  • 1,868
  • 1
  • 9
  • 38

3 Answers3

2

Note that none of this is mysterious if you use a CTE instead of UPDATE ... FROM, eg start with a SELECT that returns the rows you want to update

  select u.id, u.assid, s.assid new_assid 
  FROM ud u
  JOIN sale s ON u.id=s.id

. Examine and validate the data, then wrap it in a CTE and UPDATE it

with q as
(
  select u.id, u.assid, s.assid new_assid 
  FROM ud u
  JOIN sale s ON u.id=s.id
)
update q set assid = new_assid
David Browne - Microsoft
  • 80,331
  • 6
  • 39
  • 67
2

The documentation calls this out specifically:

FROM <table_source>

... snip ...

If the object being updated is the same as the object in the FROM clause and there is only one reference to the object in the FROM clause, an object alias may or may not be specified. If the object being updated appears more than one time in the FROM clause, one, and only one, reference to the object must not specify a table alias. All other references to the object in the FROM clause must include an object alias.

You do not have to specify an alias for any table. If you do not then the table can obviously only be specified once.

But if you do give an alias then it depends: if the table exists in the query once then either with or without an alias is OK. Otherwise you must use the alias.

To be honest, as @AaronBertrand says, do yourself and everyone a favour and always use an alias on each table, and specify that alias in the UPDATE alias SET part.

Equally you should always use table aliases for each column reference in the joins or expressions. This is the same as you should do in any SELECT.

However, I don't think it's necessary to use it on the left-hand-side columns being assigned to, as these can only ever be from a single table/view/alias: the one directly after the keyword UPDATE, so adding table aliases is just verbiage.
In SQL dialects such as MySQL which do allow this then you should specify the table alias even on the left-hand-side.

column_name

Is a column that contains the data to be changed. column_name must exist in table_or view_name.

So a recommended statement would look like this:

UPDATE u     -- use the alias mentioned below
SET
  assid = s.assid    -- lh-side doesn't need alias, right-side does
FROM ud u    -- alias all tables
JOIN sale s ON u.id = s.id    -- add table aliases to all columns in joins
Charlieface
  • 52,284
  • 6
  • 19
  • 43
1

Technically, you don't need to use an alias prefix on a column reference:

  • in the left-hand side of SET since, logically, that column can't possibly come from any other table
  • if there are no column name conflicts that need disambiguation (and you know for sure you will never add columns to any of the involved tables that could become a conflict)

IMHO you shouldn't worry about any "rules" about when you can get away with not prefixing, and just always reference the alias a column belongs to. Why not just be clear and explicit always?

If you're looking for "official" rules, I think you'll be out of luck. Nothing about this could be found in this doc, though the first example has this, which is a little overboard IMHO (and doesn't use aliases):

UPDATE dbo.Table2   
SET dbo.Table2.ColB = dbo.Table2.ColB + dbo.Table1.ColB  
FROM dbo.Table2   
    INNER JOIN dbo.Table1   
    ON (dbo.Table2.ColA = dbo.Table1.ColA);  

On this site, many long-standing canonical examples are fully explicit on all column references, e.g. this one.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • It's not just over the top, it's [deprecated](https://learn.microsoft.com/en-us/sql/database-engine/deprecated-database-engine-features-in-sql-server-2016?redirectedfrom=MSDN&view=sql-server-ver15). And the docs do specify the rules – Charlieface May 01 '22 at 11:27
  • @Charlieface I guess I meant specifically about the two bullet points I mentioned: on the left-hand side of `SET` and when a column name can only exist in one of the tables (for now). I also don't take much meaning from `must not specify a table alias`. – Aaron Bertrand May 01 '22 at 17:16
  • Probably should be "may" rather than "must". It's coming from the other perspective: if you have *not* used the alias in the `UPDATE ... SET` and there are multiple references then exactly one reference may have no alias. I agree it's rather unclear, and could do with cleaning up and clarifying, as could many of the examples. I also agree that you should *always* specify the alias to the left of `SET`. But I don't think it's necessary on each column reference being assigned to (as opposed to part of the expression to assign) as only one table/view can ever be updated in a single `UPDATE` – Charlieface May 01 '22 at 19:35