0

I'm trying to update a table from another one, the request ends up successful but no rows are updated. Each table contains ~32M rows. I'm running on PostgreSQL 11.12.

Here's the 2 tables (i've removed columns not used in the request):

CREATE TABLE IF NOT EXISTS public.sirene_geo
(
    siret character varying(50) NOT NULL,
    x numeric,
    y numeric,
    CONSTRAINT sirene_geo_etablissement_pkey PRIMARY KEY (siret)
)

CREATE TABLE IF NOT EXISTS public.sirene_eta
(
    siret character varying(50) NOT NULL,  
    latitude numeric,
    longitude numeric,
    CONSTRAINT sirene_stock_etablissement_pk PRIMARY KEY (siret)
)

Update request:

UPDATE sirene_eta eta
SET longitude = x,
    latitude = y
FROM sirene_geo geo
WHERE eta.siret = geo.siret

On pgAdmin (v5.4), it is indicated -1 for "Rows Affected" field.
Postgres is using hash join strategy to complete the update. Also, there are less rows in sirene_geo than sirene_eta, still Postgres is building hash table on sirene_geo (thus leading to no match for some rows).

When I try an update with limit inside subquery table, it works but it is using nested loop strategy which is definitely not suitable to update the whole table.

Update:
There is no concurrent write activity. I've checked the log and indeed, there's an error:

ERROR: could not write to file "base/pgsql_tmp/pgsql_tmp9264.8256": No space left on device

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Bil11
  • 85
  • 1
  • 9
  • No such thing as "Postgre". https://wiki.postgresql.org/wiki/Identity_Guidelines Please disclose your version of Postgres and pgAdmin. My pgAdmin4 (current version 6.7) reports the standard command tag "UPATE 0" when no rows have been affected. I suspect you run into an error. Connection timeout? Or out of memory? You should see a proper error msg! Check your DB logs. – Erwin Brandstetter Mar 20 '22 at 02:49
  • Also, if there is concurrent write activity, you might easily run into deadlocks when updating 32M rows at once in random order ... Is there? – Erwin Brandstetter Mar 20 '22 at 02:58
  • Hi, great to know for the mispelling, I've edited my post. I'm currently running PgAdmin v5.4 and Postgres 11.12. There is no concurrent write activity. I've checked the log and indeed, there's an error : `ERROR: could not write to file "base/pgsql_tmp/pgsql_tmp9264.8256": No space left on device` – Bil11 Mar 20 '22 at 11:17
  • Please clarify via edits, not comments. Please don't add "EDIT"s, just edit to tthe best presentation possible. But don't edit in a way that invalidates reasonable posted answers. Please delete & flag obsolete comments. PS Now what is the question you are trying to ask? PS [mre] – philipxy Mar 20 '22 at 19:42

1 Answers1

0

You are running out of space on your storage device. Make room on disk (or whatever you use as storage) before starting the big UPDATE. Delete dispensable files (unrelated to the database). Or shrink your database somehow.

A plain VACUUM might do the job. Or VACUUM FULL (blocks concurrent access) to aggressively shrink physical storage. If you cannot afford to block, consider one of the non-blocking community tools. See:

VACUUM FULL preferably not on sirene_eta (the target table) which will reuse dead tuples in the UPDATE anyway (after a plain VACUUM). And make sure VACUUM is not blocked by a long running transaction. See:

Whatever else you do, if you don't expect that all targeted rows actually change, add a WHERE condition to filter empty updates (at full cost!)

UPDATE sirene_eta eta
SET    longitude = geo.x
     , latitude  = geo.y
FROM   sirene_geo geo
WHERE  eta.siret = geo.siret
AND   (eta.longitude IS DISTINCT FROM geo.x  -- !
    OR eta.latitude  IS DISTINCT FROM geo.y)

Might even fix your problem by reducing the work to be done (dramatically). (Turns out, it doesn't in your case.

See:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228