The following procedure runs against a table with 20M records. It stopped at ~560,000 and I don't know why; I'm still learning DB monitoring/troubleshooting. This is on Aurora Postgres Serverless v2 with 6 ACUs (12 GB of memory). The content is HTML stored in a text column.
create or replace procedure schema_v2_update_href_again()
language plpgsql
as $$
declare
    max_id bigint;
begin
    select max(version_id)
    into max_id
    from versions;

    -- walk every id from 0 to the current max, one update per id
    for id in 0..max_id loop
        update versions
        set content = regexp_replace(content, 'href="./US/eng/1/article/1/', 'href="./US/eng/1/article/', 'g')
        where version_id = id;

        -- log progress and commit every 10,000 ids
        if id % 10000 = 0 then
            raise notice 'id: %', id;
            commit;
        end if;
    end loop;
end; $$;
Should I be looking at the DB monitoring tools, or could it be something I've missed in the plpgsql?
The goal of the procedure is to go through each record and replace a URL pattern in the HTML. There is no performance requirement; I'm not trying to be fast or efficient, I just wanted something simple. Is this method naive?
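In case it helps, here is roughly how I've been checking from a second connection whether the procedure's backend is still alive or stuck waiting on a lock. pg_stat_activity is the standard Postgres view (available on Aurora too); filtering on the procedure name is just my guess at how to find the right backend:

```sql
-- From a second session: is the procedure's backend still active,
-- idle, or waiting on something? wait_event_type/wait_event show
-- what, if anything, it is blocked on.
select pid,
       state,
       wait_event_type,
       wait_event,
       now() - xact_start as xact_age,
       left(query, 80)   as query
from pg_stat_activity
where query ilike '%schema_v2_update_href_again%'
  and pid <> pg_backend_pid();
```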
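For reference, this is the kind of set-based batched version I was considering as an alternative. It is only a sketch: it updates a range of ids per statement instead of one row at a time, and it swaps regexp_replace for plain replace() on the assumption that the pattern is a literal string (in the regex version the unescaped dots match any character, which I assume is not intended):

```sql
-- Sketch: batched set-based rewrite. COMMIT inside a DO block
-- requires PostgreSQL 11+ (Aurora Serverless v2 is 13+).
do $$
declare
    batch_size constant bigint := 10000;
    max_id bigint;
    lo bigint := 0;
begin
    select max(version_id) into max_id from versions;

    while lo <= max_id loop
        update versions
        set content = replace(content,
                              'href="./US/eng/1/article/1/',
                              'href="./US/eng/1/article/')
        where version_id between lo and lo + batch_size - 1
          -- skip rows that would not change, so unaffected rows
          -- are not rewritten (and their index entries not churned)
          and content like '%href="./US/eng/1/article/1/%';

        commit;
        raise notice 'done through id %', lo + batch_size - 1;
        lo := lo + batch_size;
    end loop;
end $$;
```

One assumption baked in here: every update to content also has to maintain the hash index on content, so skipping unchanged rows should reduce that overhead.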
Update #1:
Schema for the versions table:
create table versions (
    version_id        bigserial primary key,
    group_address     entity_address not null default (0, 0),
    part_id           bigint not null references parts(part_id),
    created_at        timestamp not null default current_timestamp,
    content_type      smallint not null,
    content           text not null,
    root_entity_id    bigint,
    root_entity_table table_name
);

create index on versions using hash (content);
create index on versions (part_id);