Setup: PostgreSQL on AWS RDS; Java application on an AWS EC2 instance with 8GB RAM.

  1. I have a scheduled task in Java Spring that runs every night at 00:00:01 and fetches all data older than currentDateTime minus 15 days (mostly 1.5M+ rows every day); see the sketch below these steps.

  2. The data is then written to a CSV file, the CSV is zipped, and the ZIP is uploaded to S3.

  3. All the data that was written to the CSV is then deleted from the table using a native query:

DELETE FROM MY_TABLE WHERE id >= 1 AND id <= 10000000;

The ID range is obtained from the list fetched in Step 1.
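
For context, the whole nightly job looks roughly like this (a minimal sketch; the repository method findByCreatedAtBefore, the csvExporter helper, and the bucket name are assumptions for illustration, not the real code):

    // Runs daily at 00:00:01 (Spring cron fields: second, minute, hour, day, month, weekday)
    @Scheduled(cron = "1 0 0 * * *")
    public void archiveAndPurgeLogs() {
        LocalDateTime cutoff = LocalDateTime.now().minusDays(15);
        List<DeviceAppStatusLog> rows =
                deviceAppStatusLogRepository.findByCreatedAtBefore(cutoff); // step 1
        if (rows.isEmpty()) {
            return;
        }
        Path zip = csvExporter.writeCsvAndZip(rows); // step 2: CSV, then ZIP
        s3Client.putObject(
                PutObjectRequest.builder()
                        .bucket("my-archive-bucket") // assumed bucket name
                        .key(zip.getFileName().toString())
                        .build(),
                RequestBody.fromFile(zip)); // step 2: upload to S3 (AWS SDK v2)
        deleteLog(rows.get(0).getId(),
                rows.get(rows.size() - 1).getId()); // step 3: delete by ID range
    }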

The entire process up to the S3 upload completes in about 1.5 minutes, but the DELETE query often crashes the EC2 instance.

Please note:

  1. I have an index on the ID column.

  2. The entire process has to be automated, so it must be done through the code scheduler only.

  3. Dumping the data to a temp table is not an option, as the data is useless afterwards.

  4. I tried the solutions given [here](https://stackoverflow.com/a/72725343).

  5. The delete method is annotated with @Async:

    @Async
    private void deleteLog(Long startId, Long lastId) {
        deviceAppStatusLogRepository.deleteBetween(startId, lastId);
    }
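
One thing worth noting here: Spring's @Async is proxy-based, so it has no effect on a private method called from within the same class; deleteLog as written runs synchronously on the scheduler thread. A minimal sketch of an arrangement where the annotation would actually take effect (class name assumed):

    @Service
    public class LogPurgeService {

        private final DeviceAppStatusLogRepository deviceAppStatusLogRepository;

        public LogPurgeService(DeviceAppStatusLogRepository repository) {
            this.deviceAppStatusLogRepository = repository;
        }

        // @Async only applies to public methods invoked through the Spring proxy,
        // i.e. called from another bean (and @EnableAsync must be configured).
        @Async
        public void deleteLog(Long startId, Long lastId) {
            deviceAppStatusLogRepository.deleteBetween(startId, lastId);
        }
    }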

  • Is the database crashing, or the application server? Do you get any kind of logs from the crash? – Jens Schauder Sep 16 '22 at 06:18
  • Sounds like you hit the DB transaction log high water mark: do it in batches of 10,000 rows at a time (sketched after this comment thread). – Bohemian Sep 16 '22 at 06:28
  • @Bohemian there is no such thing as a "*transaction log high water mark*" in Postgres. Is that an RDS thing? – Sep 16 '22 at 06:47
  • @JensSchauder The DB doesn't crash, as it is managed by AWS RDS, but most likely the Java app ran out of memory; I'm waiting for the logs from my DevOps to see the exception message. – Gladiator9120 Sep 16 '22 at 06:52
  • @Bohemian batching is not possible; it has to be done in one shot or not at all. – Gladiator9120 Sep 16 '22 at 06:53
  • I am currently trying this to see if it achieves what I need: https://stackoverflow.com/a/32728314/9674183 – Gladiator9120 Sep 16 '22 at 06:53
  • I don't see how this could crash the application, since the JDBC statement executed returns just a single number. Doesn't need much memory to handle that. I consider a database problem way more likely. – Jens Schauder Sep 16 '22 at 08:19
  • @JensSchauder Got the logs from DevOps: OutOfMemoryError. It's a development server with just 4GB RAM. Testing locally, I changed the logic from creating the CSV in Java to letting PostgreSQL COPY the CSV to the /tmp folder, thinking that might reduce the CPU and memory load. But CPU usage has still been at 100% for almost 3 hours now, and RAM is at almost 15GB of 16 used. Following this answer: https://stackoverflow.com/a/32728314/9674183 – Gladiator9120 Sep 16 '22 at 08:53
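
For reference, the batching suggested by Bohemian (and by the linked answer) would look roughly like this with a plain JdbcTemplate; the table name, column name, and batch size are assumptions, and the question above rules this approach out:

    // Delete the ID range in small chunks; each update() call outside a
    // @Transactional method commits on its own, so no single transaction
    // has to touch all ~1.5M rows at once.
    public void deleteBetweenInBatches(JdbcTemplate jdbc, long startId, long lastId) {
        final int batchSize = 10_000;
        int deleted;
        do {
            deleted = jdbc.update(
                    "DELETE FROM my_table WHERE id IN ("
                    + " SELECT id FROM my_table"
                    + " WHERE id BETWEEN ? AND ?"
                    + " ORDER BY id LIMIT ?)",
                    startId, lastId, batchSize);
        } while (deleted == batchSize);
    }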
