I need to load data from TSV files into a table which has 7 foreign keys. I found a suggestion online that it could be done through a proxy table that has columns matching the input files and a trigger on insert event that populates the data to the destination table and its parents with FK constraint. I started testing it and I noticed that the parent tables have their PK IDs incremented by more than 1, I mean instead of the expected sequence of 1, 2, 3, ... I'm getting something like 1, 63, 539, ... . Have a look at my trigger code:
create trigger trig1 before insert on temp_table
for each row
begin
-- Set FKs
declare ts_id BIGINT UNSIGNED;
declare cluster_id BIGINT UNSIGNED;
declare queue_id BIGINT UNSIGNED;
declare service_class_id BIGINT UNSIGNED;
declare project_id BIGINT UNSIGNED;
declare job_group_id BIGINT UNSIGNED;
declare user_id BIGINT UNSIGNED;
insert into timeline (ts) values (NEW.ts) on duplicate key update id=last_insert_id(id);
select last_insert_id() into ts_id;
insert into clusters (name) values (NEW.cluster_name) on duplicate key update id=last_insert_id(id);
select last_insert_id() into cluster_id;
insert into queues (name) values (NEW.queue_name) on duplicate key update id=last_insert_id(id);
select last_insert_id() into queue_id;
insert into service_classes (name) values (NEW.service_class) on duplicate key update id=last_insert_id(id);
select last_insert_id() into service_class_id;
insert into projects (code) values (NEW.project_code) on duplicate key update id=last_insert_id(id);
select last_insert_id() into project_id;
insert into job_groups (name) values (NEW.job_group) on duplicate key update id=last_insert_id(id);
select last_insert_id() into job_group_id;
insert into users (name) values (NEW.user_name) on duplicate key update id=last_insert_id(id);
select last_insert_id() into user_id;
-- Insert a new row to the dest table
insert into dest_table values(ts_id, cluster_id, queue_id, service_class_id, project_id, job_group_id, user_id, NEW.job_count, NEW.slot_count, NEW.mem_req, NEW.mem_used);
end;
It seems to me that in the statements 'insert into ... on duplicate key update id=last_insert_id(id);' whenever the duplicate insert occurs the 'update' affect the last insert id even though there is no change to the parent table. Please let me know what you think - is my guess about this auto_increment behaviour correct and how to prevent such issue. In production this issue may cause the PKs in parent tables to reach to max value quicker that expected, so I would like to avoid that.