
I have a function that deletes more than 1,000,000 records, and it takes around 1 minute.

CREATE OR REPLACE FUNCTION DeleteRecord(p_cmpId integer)
RETURNS boolean AS
$BODY$
DECLARE
    counter   INTEGER := 1;
    batch     INTEGER := 0;
    batchSize INTEGER := 50000;
BEGIN
    -- delete department rows for the company in batches of batchSize
    SELECT count(*) / batchSize INTO batch FROM department WHERE cmpId = p_cmpId;
    WHILE counter <= batch + 1 LOOP
        DELETE FROM department WHERE deptId IN (
            SELECT deptId FROM department WHERE cmpId = p_cmpId LIMIT batchSize);
        RAISE NOTICE 'Counter: %', counter;
        counter := counter + 1;
    END LOOP;

    -- delete employee rows for the company in batches of batchSize
    counter := 1;
    SELECT count(*) / batchSize INTO batch FROM employee WHERE cmpId = p_cmpId;
    WHILE counter <= batch + 1 LOOP
        DELETE FROM employee WHERE empId IN (
            SELECT empId FROM employee WHERE cmpId = p_cmpId LIMIT batchSize);
        RAISE NOTICE 'Counter: %', counter;
        counter := counter + 1;
    END LOOP;

    RETURN true;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

ALTER FUNCTION DeleteRecord(integer)
  OWNER TO postgres;

The department table holds 500,000 records for this cmpId.

The employee table holds more than 500,000 records for this cmpId.

I also tried a plain single DELETE, but it took more than 1 minute.

I found one link: How do I speed up deletes from a large database table?

It suggests doing the delete operation in batches.

But that helps only a little; the time is reduced by just 2 to 3 seconds.

Hemant Malpote
  • I don't think that deleting in batches will be any faster than doing it in a single statement. Doing things in a loop is usually a lot slower than doing it with a single statement. Solutions for SQL Server won't necessarily be correct for Postgres. – Nov 17 '17 at 07:43
  • Then what would be the alternative? – Hemant Malpote Nov 17 '17 at 07:46
  • Partitioning might be the best way to delete many records at a time. See https://www.postgresql.org/docs/10/static/ddl-partitioning.html – StephaneM Nov 17 '17 at 08:01
  • Why are you deleting this in `PL/pgSQL`? A single statement should be way faster. First of all, try to force index usage in the delete by replacing `in` with `some` (which version of `pg` are you using?). Please post the `explain` output of the delete statement. – JustMe Nov 17 '17 at 08:08 (see the sketch after these comments)
  • One option to speed up bulk deletes is to drop all indexes before doing that, then re-create them afterwards. However this is only an option if nothing else writes to that table during the delete process. –  Nov 17 '17 at 08:12
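
For reference, a rough sketch of the single-statement delete the comments above have in mind, using the table and column names from the question; the literal 1 is only a placeholder for a real cmpId, and the `= SOME (...)` form is shown purely to illustrate the commenter's suggestion (it is equivalent to `IN` in Postgres):

-- Sketch only: single-statement deletes instead of a batched loop.
-- The value 1 is a placeholder for an actual cmpId.
DELETE FROM department WHERE cmpId = 1;
DELETE FROM employee   WHERE cmpId = 1;

-- Equivalent form using = SOME over a subquery, as suggested in the comments:
DELETE FROM employee
WHERE empId = SOME (SELECT empId FROM employee WHERE cmpId = 1);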

1 Answer


There is not really much you can do to speed up large deletes.

Increase `max_wal_size` if checkpoints are occurring too frequently, and get faster storage.
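
As a rough illustration (the value is arbitrary, not a tuned recommendation for this workload), `max_wal_size` can be raised with `ALTER SYSTEM` and picked up with a configuration reload:

-- Example only: a larger max_wal_size spaces checkpoints further apart.
-- '4GB' is an illustrative value, not a recommendation.
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();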

If you face that kind of problem regularly, consider partitioning your table. Dropping a partition is much faster than a mass delete.
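
As a minimal sketch, assuming the employee table could be rebuilt as a declaratively partitioned table keyed on cmpId (PostgreSQL 10 or later; the column layout and partition names here are made up for illustration):

-- Hypothetical layout: one list partition per cmpId value.
CREATE TABLE employee (
    empId  bigint,
    cmpId  integer,
    name   text
) PARTITION BY LIST (cmpId);

CREATE TABLE employee_cmp_1 PARTITION OF employee FOR VALUES IN (1);
CREATE TABLE employee_cmp_2 PARTITION OF employee FOR VALUES IN (2);

-- Removing every row for cmpId = 1 then becomes a quick catalog operation:
DROP TABLE employee_cmp_1;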

Laurenz Albe