514

I have this table in a postgres 8.4 database:

CREATE TABLE public.dummy
(
  address_id SERIAL,
  addr1 character(40),
  addr2 character(40),
  city character(25),
  state character(2),
  zip character(5),
  customer boolean,
  supplier boolean,
  partner boolean
  
)
WITH (
  OIDS=FALSE
);

I want to update the table. Initially i tested my query using this insert statement:

insert into address customer,supplier,partner
SELECT  
    case when cust.addr1 is not null then TRUE else FALSE end customer, 
    case when suppl.addr1 is not null then TRUE else FALSE end supplier,
    case when partn.addr1 is not null then TRUE else FALSE end partner
from (
    SELECT *
        from address) pa
    left outer join cust_original cust
        on (pa.addr1=cust.addr1 and pa.addr2=cust.addr2 and pa.city=cust.city 
            and pa.state=cust.state and substring(cust.zip,1,5) = pa.zip  )
    left outer join supp_original suppl 
        on (pa.addr1=suppl.addr1 and pa.addr2=suppl.addr2 and pa.city=suppl.city 
                and pa.state=suppl.state and pa.zip = substring(suppl.zip,1,5))
    left outer join partner_original partn
        on (pa.addr1=partn.addr1 and pa.addr2=partn.addr2 and pa.city=partn.city
                  and pa.state=partn.state and pa.zip = substring(partn.zip,1,5) )
where pa.address_id = address_id

How can I convert this to an update statement, i.e., update existing rows using values returned from a select statement?

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
stackover
  • 6,275
  • 6
  • 22
  • 20

8 Answers8

1074

Postgres allows:

UPDATE dummy
SET customer=subquery.customer,
    address=subquery.address,
    partn=subquery.partn
FROM (SELECT address_id, customer, address, partn
      FROM  /* big hairy SQL */ ...) AS subquery
WHERE dummy.address_id=subquery.address_id;

This syntax is not standard SQL, but it is much more convenient for this type of query than standard SQL. I believe Oracle (at least) accepts something similar.

gsamaras
  • 71,951
  • 46
  • 188
  • 305
Andrew Lazarus
  • 18,205
  • 3
  • 35
  • 53
  • it seems that i'm trying for a bit different thing for eg. if there are 3 bool columns c1,c2,c3 all set to false initially. but based on subquery are set to true. update set c1=TRUE where id in (subquery1),set c2=TRUE where id in (subquery2), set c3=True where id in (subquery3). I was successful when i do split this as 3 updates but i'm not sure how to attain the result with a single update. hope this make sense. – stackover Jun 08 '11 at 16:13
  • 4
    FWIW, Oracle does accept that basic construct, however the performance of the update tends to degrade severely as the tables get larger. That's o.k. though as Oracle also supports the MERGE statement. – gsiems Mar 11 '15 at 01:41
  • 6
    This totally does not work in postgresql 9.5, I get `ERROR: 42P01: relation "dummy" does not exist` – user9645 Oct 14 '16 at 14:37
  • 194
    `dummy` has to be replaced by the name of the table you are trying to update. Please understand question and answer before trying to apply. – Andrew Lazarus Oct 14 '16 at 14:39
  • 6
    Yes, sorry about that. My bad. Did not notice the original question was using table named "dummy" – user9645 Oct 14 '16 at 15:08
  • 4
    It may be worth to mention that in the beginning of the query it is not necessary to specify the path to the column of the left side, only at the end, otherwise the db will complain with ERROR: column reference "address_id" is ambiguous – OJVM Apr 19 '17 at 17:26
  • 1
    This way of doing the query is much slower when i benchmarked against using WITH CTE AS( – HendrikThurauEnterprises Jan 23 '23 at 09:50
  • @HendrikThurauEnterprises This surprises me, but I suppose it depends on the contents of the "Big Hairy SQL". – Andrew Lazarus Jan 23 '23 at 23:07
  • 1
    @AndrewLazarus I was also surprised, my query is comparing IP tables and updating empty data, joined by IP-IP CIDR. WITH CTE is 32sec compared to 1min 5sec FROM (SELECT – HendrikThurauEnterprises Jan 24 '23 at 15:04
224

You're after the UPDATE FROM syntax.

UPDATE 
  table T1  
SET 
  column1 = T2.column1 
FROM 
  table T2 
  INNER JOIN table T3 USING (column2) 
WHERE 
  T1.column2 = T2.column2;

References

Madbreaks
  • 19,094
  • 7
  • 58
  • 72
Brian Webster
  • 30,033
  • 48
  • 152
  • 225
113

If there are no performance gains using a join, then I prefer Common Table Expressions (CTEs) for readability:

WITH subquery AS (
    SELECT address_id, customer, address, partn
    FROM  /* big hairy SQL */ ...
)
UPDATE dummy
SET customer = subquery.customer,
    address  = subquery.address,
    partn    = subquery.partn
FROM subquery
WHERE dummy.address_id = subquery.address_id;

IMHO a bit more modern.

Kerem
  • 11,377
  • 5
  • 59
  • 58
steevee
  • 2,238
  • 1
  • 21
  • 16
  • 3
    The syntax isn't compatible with older versions of Postgres, before v9.1, (see https://www.postgresql.org/docs/9.1/static/sql-update.html and the previous versions) I'm on v8.2, so you have to put the entire CTE/With statement inside brackets after the FROM keyword and it will work. – Hansang Dec 11 '17 at 04:06
  • 2
    This is what I looked everywhere for. Thanks! – Y Y Sep 19 '21 at 07:19
  • 4
    Without wanting to take anything away from my top-rated answer across Stack Exchange, this syntax, which did not exist when my answer was written, is a little easier to read. – Andrew Lazarus May 02 '22 at 20:56
  • 2
    Performing a complicated update over 30k rows, the WITH syntax took 32seconds, the FROM (SELECT ... ) took 1minute 5seconds – HendrikThurauEnterprises Jan 23 '23 at 09:47
96

There are many ways to update the rows.

When it comes to UPDATE the rows using subqueries, you can use any of these approaches.

  1. Approach-1 [Using direct table reference]
UPDATE
  <table1>
SET
  customer=<table2>.customer,
  address=<table2>.address,
  partn=<table2>.partn
FROM
  <table2>
WHERE
  <table1>.address_id=<table2>.address_i;

Explanation: table1 is the table which we want to update, table2 is the table, from which we'll get the value to be replaced/updated. We are using FROM clause, to fetch the table2's data. WHERE clause will help to set the proper data mapping.

  1. Approach-2 [Using SubQueries]
UPDATE
  <table1>
SET
  customer=subquery.customer,
  address=subquery.address,
  partn=subquery.partn
FROM
  (
    SELECT
      address_id, customer, address, partn
    FROM  /* big hairy SQL */ ...
  ) AS subquery
WHERE
  dummy.address_id=subquery.address_id;

Explanation: Here we are using subquerie inside the FROM clause, and giving an alias to it. So that it will act like the table.

  1. Approach-3 [Using multiple Joined tables]
UPDATE
  <table1>
SET
  customer=<table2>.customer,
  address=<table2>.address,
  partn=<table2>.partn
FROM
  <table2> as t2
  JOIN <table3> as t3
  ON
    t2.id = t3.id
WHERE
  <table1>.address_id=<table2>.address_i;

Explanation: Sometimes we face the situation in that table join is so important to get proper data for the update. To do so, Postgres allows us to Join multiple tables inside the FROM clause.

  1. Approach-4 [Using WITH statement]
  • 4.1 [Using simple query]
WITH subquery AS (
    SELECT
      address_id,
      customer,
      address,
      partn
    FROM
      <table1>;
)
UPDATE <table-X>
SET customer = subquery.customer,
    address  = subquery.address,
    partn    = subquery.partn
FROM subquery
WHERE <table-X>.address_id = subquery.address_id;
  • 4.2 [Using query with complex JOIN]
WITH subquery AS (
    SELECT address_id, customer, address, partn
    FROM
      <table1> as t1
    JOIN
      <table2> as t2
    ON
      t1.id = t2.id;
    -- You can build as COMPLEX as this query as per your need.
)
UPDATE <table-X>
SET customer = subquery.customer,
    address  = subquery.address,
    partn    = subquery.partn
FROM subquery
WHERE <table-X>.address_id = subquery.address_id;

Explanation: From Postgres 9.1, this(WITH) concept has been introduced. Using that we can make any complex queries and generate desired result. Here we are using this approach to update the table.

I hope, this would be helpful...

Mayur
  • 4,345
  • 3
  • 26
  • 40
  • 2
    Nicely put @Mayur thanks for taking the time to compile all of the answers in one. I personally go with number 4, because to me it's the most readable one (via separating both queries) and the most flexible since you can put whatever you need inside the WITH – juanm55 Apr 21 '21 at 15:33
  • Glad to know that My answers are helpful. Keep Coding :) – Mayur Apr 22 '21 at 12:12
  • Thanks for putting up all the options. How about the performance? Are they about the same? – newman Jan 10 '22 at 22:30
  • Unless the standard has changed, none of these options are the SQL standard. The `FROM` clause used to be an extension, while the standard required the obviously inferior `UPDATE dummy SET col1 = (SELECT col1 FROM some_other table), col2= (SELECT col2 FROM some_other_table)`—longer and repetitive. – Andrew Lazarus May 02 '22 at 20:59
12

@Mayur "4.2 [Using query with complex JOIN]" with Common Table Expressions (CTEs) did the trick for me.

WITH cte AS (
SELECT e.id, e.postcode
FROM employees e
LEFT JOIN locations lc ON lc.postcode=cte.postcode
WHERE e.id=1
)
UPDATE employee_location SET lat=lc.lat, longitude=lc.longi
FROM cte
WHERE employee_location.id=cte.id;

Hope this helps... :D

Festus Ngor
  • 301
  • 2
  • 6
4
update json_source_tabcol as d
set isnullable = a.is_Nullable
from information_schema.columns as a 
where a.table_name =d.table_name 
and a.table_schema = d.table_schema 
and a.column_name = d.column_name;
2

Another way with limit:

UPDATE tableA 
SET column = 'new value'
WHERE tableA.id in (
          SELECT tableA.id FROM tableA 
            JOIN tableB ON tableA.tableB_id = tableB.id 
           WHERE tableB.someColumn = 'XYZ' 
           LIMIT 10)
1

For PostgreSQL check https://www.postgresql.org/docs/current/sql-update.html

UPDATE tableA SET (addr1, adrr2) =
    (SELECT addr1, addr2 FROM tableB
     WHERE tableA.id = tableB.tableA_id);