1

I have a big table all_records with 100 million records and many columns. I want to insert roughly 30 million records inside this table to another table records_not_used_now based on time-consuming operations of multi columns (the operations are named as f(n) in the following query). Then delete those records from the original table all_records.

The query is like this:

insert into records_not_used_now
select * from all_records  where f(n) > 0;

delete from all_records  where f(n) > 0;

However, as the f(n) > 0 judgment costs computation time, I do not want to do it twice. Is there any query in MySQL that can select the record out of the table and then delete them at the same time? The query helps to avoid running the same judgment and scanning the table twice.

  • What about a Trigger? You can add on insert Trigger on your `records_not_used_now` table and whenever insertion happens delete the data from `all_records` by ID – Amir Saleem May 03 '21 at 04:59
  • As far as I know, there's no query to select the record out of the table and then delete them at the same time – Christophorus Reyhan May 03 '21 at 05:07
  • @AmirSaleem isn't that still uses 2 operations? – Christophorus Reyhan May 03 '21 at 05:08
  • Yes it will require 2 operations but operation on a primary key takes constant time. When deleting from a trigger you are not using `f(n)` but `ID` of that row. – Amir Saleem May 03 '21 at 05:10
  • what you are doing seems right but do it in transaction otherwise you lose some data. there you go amir ;). – Ali Momeni May 03 '21 at 05:12
  • What does `f(n)` do exactly? Maybe you can optimize it so that the engine profits from indexes and the cost isn't that much. If this is a frequent operation, the optimization will sure pay off, but for very rare executions, it may be simply better to just wait a bit more. – Alejandro May 03 '21 at 05:15
  • https://stackoverflow.com/questions/39087472/insert-and-delete-in-the-same-transaction can you check the link. If you get the answer. – snehal gugale May 03 '21 at 05:16
  • Also, remember to enclose both the `INSERT` and the `DELETE` in a transaction so that you don't lose consistency. – Alejandro May 03 '21 at 05:16
  • Whatever `f(n)` really does: why not compute it before running the queries and store the result in a variable? – Nico Haase May 03 '21 at 05:34
  • I think the key point is that we need to do it in one transaction to keep the consistency and use a key to stored the result to avoid recalculation and rescanning. – Lawrance Zhang May 03 '21 at 06:11

2 Answers2

1

As per your explanation, you are moving the some 30 M records from table to another and you want to avoid repetition of f(n) > 0 computation job due to high compute time.

I suggest to do this in 4 specific steps.

  1. Create a temporary table with the select query.
CREATE TEMPORARY_TABLE TABLE IF NOT EXISTS table2 AS (select * from all_records  where f(n) > 0;)
  1. Delete the records in the source table(all_records) using a sql join between this new temporary table and source table

  2. Insert all the records from temporary table to the destination table.

INSERT INTO records_not_used_now SELECT * FROM TEMPORARY_TABLE;
  1. Drop the temporary table
DROP TABLE TEMPORARY_TABLE ; 

Refs:

Dharman
  • 30,962
  • 25
  • 85
  • 135
sam
  • 1,819
  • 1
  • 18
  • 30
0

Create a new column named f_on_n and run

UPDATE all_records   set f_of_n = f(n);
insert into records_not_used_now select * from all_records  where f_of_n > 0;
delete from all_records  where f_of_n > 0;
Victorqedu
  • 484
  • 4
  • 20