
I have to read records from a CSV file and store them in a MySQL database.

I know about "LOAD DATA INFILE", but in my case I have to get a single record from the file, check whether it is in a valid format/length etc., and then store it in the database.

// list to store records from the CSV file
ArrayList<String> list = new ArrayList<String>();

// read one line at a time (reader is a CSVReader, nextLine is a String[])
String[] nextLine;
while ((nextLine = reader.readNext()) != null)
{
   for (String number : nextLine)
   {
      if (number.length() > 12 && number.startsWith("88"))
      {
         list.add(number);
      }
      else if (number.length() > 9 && number.startsWith("54"))
      {
         list.add(number);
      }
      else if (number.length() > 8 && number.startsWith("99"))
      {
         list.add(number);
      }
      else
      {
         // ....
      }

      // method to insert data in the database
      insertInToDatabase(list);
   }
}

And the method to insert records in the DB (taken from here):

private void insertInToDatabase(ArrayList<String> list)
{
   try
   {
      String query = "INSERT INTO mytable(numbers) VALUES(?)";

      prepStm = conn.prepareStatement(query);

      for (String test : list)
      {
         prepStm.setString(1, test);

         prepStm.addBatch(); // add to batch
         prepStm.clearParameters();
      }

      prepStm.executeBatch();
   }
   catch (SQLException e)
   {
      e.printStackTrace();
   }
}

This works, but the rate at which the records are inserted is very slow. Is there any way I can insert records faster?

– pro_newbie
  • Can you give more details? How many lines does the CSV approximately contain? Did you check how long the reading of the file / the writing to the DB took? – eiselems Feb 25 '15 at 06:53
  • Right now, for testing, the CSV file contains 90k records and it took approximately 45 minutes to insert all records. – pro_newbie Feb 25 '15 at 07:17
  • I mean, did you test how long the reading vs. the inserting needed? You could just insert three System.out.println(System.currentTimeMillis()) calls: one before the reading starts, one after the reading finishes, and one after the inserting into the DB is done. That way you could determine whether it is really the inserting into the DB (or the parsing itself) that is so slow. – eiselems Feb 25 '15 at 07:20

4 Answers


You would need to use `rewriteBatchedStatements=true`, as that is a MySQL Connector/J optimization which attempts to reduce round trips to the server by consolidating the inserts or updates into as few packets as possible.

Please refer to: https://anonymousbi.wordpress.com/2014/02/11/increase-mysql-output-to-80k-rowssecond-in-pentaho-data-integration/

There are other optimizations in that article as well. Hope this speeds up the batching.

EDIT 1: There is a lucid explanation of this parameter on this site as well: refer to: MySQL and JDBC with rewriteBatchedStatements=true
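As a rough illustration, here is a minimal sketch of the question's batch insert with that property set on the connection URL; the host, database name, and credentials are placeholders, not taken from the question:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.List;

public class RewriteBatchExample {
    public static void insertNumbers(List<String> numbers) throws Exception {
        // The key part: rewriteBatchedStatements=true in the JDBC URL.
        String url = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO mytable(numbers) VALUES (?)")) {
            for (String n : numbers) {
                ps.setString(1, n);
                ps.addBatch();
            }
            // With the property set, the driver rewrites the batch into multi-row INSERTs.
            ps.executeBatch();
        }
    }
}

Without this property, the driver sends one INSERT per batched row, which matches the slow behaviour described in the question.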

– Khanna111
  • I tried adding `rewriteBatchedStatements=true` and it makes the insertion very fast; all the records are inserted in seconds. But I have a unique index on the field, so this gives me an exception [http://bugs.mysql.com/bug.php?id=6822]. Any way around that? – pro_newbie Feb 25 '15 at 13:13
  • That would be another question and people would be able to answer it once you ask it. That is how the site works. You could resolve this and ask another one. – Khanna111 Feb 25 '15 at 18:07
  • The link http://anonymousbi.wordpress.com/2014/02/11/increase-mysql-output-to-80k-rowssecond-in-pentaho-data-integration/ is dead. – ng.newbie Feb 08 '22 at 09:59

I think the better approach is to process the CSV file with the rules defined and create another CSV out of it; once the output CSV is prepared, do LOAD DATA INFILE.

It'll be pretty quick.
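A minimal sketch of this approach, assuming the opencsv reader the question appears to use and hypothetical file names (input.csv, filtered.csv); LOAD DATA LOCAL INFILE also needs local-infile enabled on both client and server:

import com.opencsv.CSVReader;
import java.io.FileReader;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.Statement;

public class FilterThenLoad {
    static void filterAndLoad(Connection conn) throws Exception {
        // Write only the records that pass the question's validation rules.
        try (CSVReader reader = new CSVReader(new FileReader("input.csv"));
             PrintWriter out = new PrintWriter("filtered.csv")) {
            String[] line;
            while ((line = reader.readNext()) != null) {
                for (String number : line) {
                    if ((number.length() > 12 && number.startsWith("88"))
                            || (number.length() > 9 && number.startsWith("54"))
                            || (number.length() > 8 && number.startsWith("99"))) {
                        out.println(number); // one valid number per row
                    }
                }
            }
        }
        // Bulk-load the cleaned file in a single statement.
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("LOAD DATA LOCAL INFILE 'filtered.csv' "
                    + "INTO TABLE mytable (numbers)");
        }
    }
}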

– anurag gupta

If you want to insert through your own application, create a multi-row batch query like this and execute it on the MySQL server.

String query = "INSERT INTO mytable(numbers) "
             + "VALUES (0), "
             + "       (1), "
             + "       (2), "
             + "       (3)";
– Sujit Rai

@Khanna111's answer is good.

I don't know if it helps, but try checking the table's engine type. I once encountered a problem in which records were being inserted very slowly; I changed the engine from InnoDB to MyISAM and insertion became very fast.
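If you want to try that, here is a minimal sketch, assuming an open Connection and the mytable table from the question, that checks the current engine and switches it as described:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class EngineCheck {
    static void switchToMyISAM(Connection conn) throws Exception {
        try (Statement stmt = conn.createStatement()) {
            // Report the engine currently used by the table.
            try (ResultSet rs = stmt.executeQuery(
                    "SHOW TABLE STATUS WHERE Name = 'mytable'")) {
                if (rs.next()) {
                    System.out.println("Current engine: " + rs.getString("Engine"));
                }
            }
            // Switch the table to MyISAM as this answer suggests.
            stmt.execute("ALTER TABLE mytable ENGINE = MyISAM");
        }
    }
}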

– nice guy