
I have around 70 million records in files, and I have to read all the records and insert them into a MySQL database. It takes about 2 days to insert all the data. The issue is that now I have to apply a few conditions before adding the records to the database, but when I add the conditions it takes even more days to complete. This process is done every week, so if it takes 4 days to read and sort the data out of a 7-day week, it will be of no use.

Is there any quick way to make reading the data take less time?

Current file size: 500 MB

The main condition is that when I read the data and a record does not meet the requirements, that record is stored in another table.

I have another option as well, in case the above cannot be done. I have to show the data on the front-end after the insertion. Suppose there is no quick way to minimize the time; then, after inserting all the data, I have to show it on the front-end at 100 records per page. Is there any way to put a condition into the pagination so that each page shows only 100 records which meet my requirements, then the next 100 which meet the requirements, and so on?

Suppose I read 100 records and only 80 of them meet my requirements; then the first page must pull 20 more records which meet the requirements.
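
One way to handle this, if the filtering has to happen at display time, is to put the same requirements into the WHERE clause of the pagination query itself, so that every page naturally returns 100 qualifying rows and MySQL does the skipping. This is only a rough sketch; the conditions shown are placeholders for the real checks:

SELECT *
FROM accounts
WHERE first_name <> ''
  AND last_name <> ''
  AND current_balance <> ''
  AND current_balance <> '0'
LIMIT 100 OFFSET 0;  -- page 1; page N would use OFFSET (N-1)*100

With this, a page never contains fewer than 100 matching records (until the data runs out), so there is no need to "top up" a page in PHP.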

Here is an example of the data and how it should work:

master_account_number,sub_account_number,current_balance,first_name,last_name,city,state,zip,client_id

01-962040041,01-000050003,11.25,IRENE,AULTMAN,MICHELLVILLE,MD,20721,000969

I need this format for every row. If any field is empty, the row is considered bad data and is stored in the other table. If the format of the data does not match, it is also considered bad data. If the balance is zero, it is bad data too, and there are many more conditions as well.
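
Just to illustrate how the good/bad split could be done in bulk SQL instead of checking row by row in PHP: the idea would be to LOAD DATA INFILE into a staging table first and then move the rows with a few set-based statements. This is only a sketch; the table names accounts_staging and bad_accounts (assumed to have the same columns as accounts) and the exact conditions are made up for the example:

-- copy rows that fail any check into the bad-data table
INSERT INTO bad_accounts
SELECT * FROM accounts_staging
WHERE accountNumber = '' OR caseNumber = ''
   OR first_name = '' OR last_name = ''
   OR city = '' OR state = '' OR zip = '' OR client_id = ''
   OR current_balance = '' OR CAST(current_balance AS DECIMAL(12,2)) = 0
   OR accountNumber NOT REGEXP '^[0-9]{2}-[0-9]{9}$';

-- remove them from staging; everything left over is good data
DELETE FROM accounts_staging
WHERE accountNumber = '' OR caseNumber = ''
   OR first_name = '' OR last_name = ''
   OR city = '' OR state = '' OR zip = '' OR client_id = ''
   OR current_balance = '' OR CAST(current_balance AS DECIMAL(12,2)) = 0
   OR accountNumber NOT REGEXP '^[0-9]{2}-[0-9]{9}$';

INSERT INTO accounts SELECT * FROM accounts_staging;

Because these are set-based statements, MySQL can process all the rows without a round trip to PHP for each record, which is usually far faster than per-row checks.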

I am using:

Codeigniter 
IIS 7 Server
MySQL

I am storing all data as strings.

And I use this query to load the data into the DB:

LOAD DATA INFILE 'C:\\inetpub\\wwwroot\\www.suburbancredit.com\\csv\\atest.csv'
INTO TABLE accounts
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
LINES TERMINATED BY '\n'
IGNORE 1 LINES 
(`accountNumber`, `caseNumber`, `current_balance`, `first_name`, `last_name` , `city`, `state`, `zip`, `client_id`)
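
If the plain LOAD DATA INFILE itself starts to slow down as the table grows, a few standard MySQL session settings around the import can help. These are general bulk-load knobs, not anything specific to this schema, so treat them as something to test:

SET unique_checks = 0;
SET foreign_key_checks = 0;
SET autocommit = 0;

-- run the LOAD DATA INFILE statement shown above here

COMMIT;
SET unique_checks = 1;
SET foreign_key_checks = 1;
SET autocommit = 1;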
  • What kind of conditions? The question has a PHP tag: Do you use a PHP script? If yes: can you show us the essential (short) part of it? Just guessing: disable indexes while importing; use prepared statements; ...but those are only the low hanging fruits; you have to tell us more specifics. – VolkerK Jul 01 '15 at 17:08
  • Please provide some of the code you have tried. – Vinay Jul 01 '15 at 17:12
  • this is interesting, please give more details about the files, each line and conditions, also the target database structure, server details – Alex Andrei Jul 01 '15 at 17:19
  • I just edited my question, you can check it. – Aadil Afzal Jul 01 '15 at 18:40
  • Use a unix socket (significantly less overhead than http(s)). Use a transaction. Use prepared statements. In PHP, PDO is faster than MySQLi iirc, so use PDO. Check this http://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite , even though it's for SQLite, much of it is valid for MySQL too. – hanshenrik Jul 01 '15 at 18:51
  • @hanshenrik From the looks of it, he's using a windows machine :/ – Populus Jul 01 '15 at 18:52
  • @Populus oh, then add Linux to the list of optimizations ;) – hanshenrik Jul 01 '15 at 18:53
  • this may be relevant: http://dba.stackexchange.com/questions/31752/mysql-load-data-infile-slows-by-80-after-a-few-gigs-of-input-with-innodb-engine – Populus Jul 01 '15 at 18:57
  • @hanshenrik the issue is how I can speed up the procedure of checking the data; when I put in conditions and check, it slows down the whole process. Currently, using this query, it stores the data in 5 minutes, but when I add the PHP conditions it is dead slow. – Aadil Afzal Jul 01 '15 at 19:00
  • Maybe look into multithreading it (PHP has excellent threading support these days with pthreads), or rewriting this heavy operation in a fast language, like C (or assembly ;) ). Let's say you make cpus+1 worker threads, and assign chunks of records to each of these workers? – hanshenrik Jul 01 '15 at 19:11

0 Answers