2

I have the following InnoDB tables:

 CREATE TABLE `vehicle` (
  `ID` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `Name` varchar(50) DEFAULT NULL,
  `Model` varchar(100) DEFAULT NULL,
  `Engine_Type` varchar(70) DEFAULT NULL,
  `Construction_From` date DEFAULT NULL,
  `Construction_To` date DEFAULT NULL,
  `Engine_Power_KW` mediumint(8) unsigned DEFAULT NULL,
  `Engine_Power_HP` mediumint(8) unsigned DEFAULT NULL,
  `CC` mediumint(8) unsigned DEFAULT NULL,
  `TTC_TYP_ID` int(11) unsigned DEFAULT NULL,
  `Vehicle_Type` tinyint(1) DEFAULT NULL,
  `ID_Body_Type` tinyint(3) unsigned DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=49407 DEFAULT CHARSET=utf8;

CREATE TABLE `part` (
  `ID` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `ID_Brand` smallint(5) unsigned DEFAULT NULL,
  `Code_Full` varchar(50) DEFAULT NULL,
  `Code_Condensed` varchar(50) DEFAULT NULL,
  `Ean` varchar(50) DEFAULT NULL COMMENT 'The part barcode.',
  `TTC_ART_ID` int(11) unsigned DEFAULT NULL COMMENT 'TecDoc ID.',
  `ID_Product_Status` tinyint(3) unsigned DEFAULT NULL,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `TTC_ART_ID_UNIQUE` (`TTC_ART_ID`),
  UNIQUE KEY `ID_Brand_Code_Full_UNIQUE` (`ID_Brand`,`Code_Full`)
) ENGINE=InnoDB AUTO_INCREMENT=3732260 DEFAULT CHARSET=utf8;

CREATE TABLE `vehicle_part` (
  `ID_Vehicle` mediumint(8) unsigned NOT NULL,
  `ID_Part` int(11) unsigned NOT NULL,
  PRIMARY KEY (`ID_Vehicle`,`ID_Part`),
  KEY `fk_vehicle_part_vehicle_id_vehicle_idx` (`ID_Vehicle`),
  KEY `fk_vehicle_part_part_id_part_idx` (`ID_Part`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Table vehicle has about 45.000 records, table part has about 3.500.000 records and table vehicle_part has approximately 100.000.000 records. Creating the secondary indexes for vehicle_part did not take too long, about 30 min for both. What I cannot do though is create the foreign key constraints: for example

ALTER TABLE `vehicle_part` 
ADD CONSTRAINT `fk_vehicle_part_vehicle_id_vehicle`
  FOREIGN KEY (`ID_Vehicle`)
  REFERENCES `vehicle` (`ID`)
  ON DELETE NO ACTION
  ON UPDATE NO ACTION;

takes ages to complete. I understand the table is rebuilt since it consumes a lot of disk space. What can I do to improve the performance? If I create the table with the fk constraints and then add the records the insert process in vehicle_part also takes ages (about 3 days). I am using a laptop with 4GB RAM.

EDIT 12/01/2016

The answer given by Drew helped a lot in improving the performance dramatically. I changed every script using SELECT ... INTO outfile and then LOAD DATA INFILE from the exported csv file. Also sometimes before LOAD DATA INFILE dropping the indexes and recreating them after the load proccess saves even more time. There is no need to drop the fk constraints just the secondary indexes.

pankal
  • 124
  • 13
  • a small note ... inserting bulk of data in an indexed table is bad in performance ..... if you can drop the index then insert then build the index again the perf will be enhanced dramatically – osama yaccoub Dec 13 '15 at 17:44
  • Yes, this is what I did and the performance had improved, but the main problem is not the index but the foreign key constraint. – pankal Dec 13 '15 at 17:57
  • why don't you consider partitioning the table, looping on partitions and insert the data by partition – osama yaccoub Dec 13 '15 at 18:01
  • My opinion on FKs? Debug your code; then you won't need FK checks. – Rick James Dec 14 '15 at 06:12
  • @osamayaccoub - adding `PARTITIONing` will take a long time, and won't really help anything. – Rick James Dec 14 '15 at 06:13
  • I understand that you may put the checks in the application code, but let's assume someone sticks to the use of FKs. There are advantages and disadvantages to each approach, but in general I prefer to stay on the safe side and use these checks in the DB. – pankal Dec 14 '15 at 16:47

1 Answers1

2

If you know your data is pristine from an FK perspective, then establish your structure without secondary indexes as suggested in comments, but with the FK in the schema yet with FK checks temporarily disabled.

Load your data. If external data, certainly do it with LOAD DATA INFILE.

After your data is loaded, turn on FK checks. And establish secondary indexes with Alter Table.

Again, going with the assumption that your data is clean. There are other ways of proving that after-the-fact for the risk-adverse.

create table student
(   id int auto_increment primary key,
    sName varchar(100) not null
    -- secondary indexes to be added later
);

create table booksAssigned
(   id int auto_increment primary key,
    studentId int not null,
    isbn varchar(20) not null,
    constraint foreign key `fk_b_s` (studentId) references student(id)
    -- secondary indexes to be added later
);


insert booksAssigned(studentId,isbn) values (1,'asdf'); -- Error 1452 as expected

set FOREIGN_KEY_CHECKS=0; -- turn FK checks of temporarily

insert booksAssigned(studentId,isbn) values (1,'asdf'); -- Error 1452 as expected

set FOREIGN_KEY_CHECKS=1; -- succeeds despite faulty data

insert booksAssigned(studentId,isbn) values (2,'38383-asdf'); -- Error 1452 as expected

As per op comments, how to drop auto-generated index in referencing table after initial schema creation:

mysql> show create table booksAssigned;

| booksAssigned | CREATE TABLE `booksassigned` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `studentId` int(11) NOT NULL,
  `isbn` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_b_s` (`studentId`),
  CONSTRAINT `booksassigned_ibfk_1` FOREIGN KEY (`studentId`) REFERENCES `student` (`id`)
) ENGINE=InnoDB |

mysql> set FOREIGN_KEY_CHECKS=0;
Query OK, 0 rows affected (0.00 sec)

mysql> drop index `fk_b_s` on booksAssigned;
Query OK, 0 rows affected (0.49 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table booksAssigned;

| booksAssigned | CREATE TABLE `booksassigned` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `studentId` int(11) NOT NULL,
  `isbn` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  CONSTRAINT `booksassigned_ibfk_1` FOREIGN KEY (`studentId`) REFERENCES `student` (`id`)
) ENGINE=InnoDB |

Further links

Community
  • 1
  • 1
Drew
  • 24,851
  • 10
  • 43
  • 78
  • But if I leave the FK constraint InnoDB will automatically create a secondary index on the referencing column, no? Also the data are external (csv files) but I cannot use load data infile, because I need to join data from different csv files. – pankal Dec 13 '15 at 18:28
  • that is per the manual as a possibility I will grant you that. However if you do the `set FOREIGN_KEY_CHECKS=0;` and insert a few rows, and perform a `show create table booksAssigned`, you may find that is does not do that auto-patch of the index you suggest. – Drew Dec 13 '15 at 18:30
  • Yes, I always perform the inserts using FOREIGN_KEY_CHECKS=0. However the time the insert takes is too much. Even mysqldump afterwards takes a lot of time to export the data to the server. – pankal Dec 13 '15 at 18:49
  • I would never do it that way for import going the other way. http://stackoverflow.com/a/32702768 – Drew Dec 13 '15 at 18:52
  • I see. So I keep the FK Constraints and put the secondary indexes afterwards. I will try that and see if I get any serious performance hit. One more thing: with your last comment you say I should not use mysqldump at all when I want to transfer the data to the server? – pankal Dec 14 '15 at 16:52
  • Great. Let us know the details and your findings of the performance. Without details of how you did it (in detail :P) we can't re-play and confirm. – Drew Dec 14 '15 at 18:35
  • 1
    As for your last question in the comment 2 lines above, mysqldump is a powerful and flexible tool. What we are trying to do is to perform the fastest import possible. Short of using raw input files, or tools from 3rd parties, ... the fastest way to perform that is with CSV data obtained and later used via `select into outfile` and LOAD DATA INFILE – Drew Dec 14 '15 at 18:38