0

So basically I have 2 tables called snomed_conceptdata with (454772 rows) and
'snomed_descriptiondata' with (1383698 rows).According to this code currently I am trying to insert records into a table called snomedinfo_data, which is working but the insert/import process is happening where slowly.I suspect that this is because of the nested while loop which takes too long while processing.

Is there an alternative way to do this so that the import/insert process can happen quickly.

package Snomed.Snomed;

import java.sql.PreparedStatement;
import java.sql.ResultSet;

import catalog.Root;

public class Snomedinfo {
    public void snomedinfoinsert()
    {
    Root oRoot = null;
    ResultSet oRsSelect = null;
    PreparedStatement oPrStmt = null;
    PreparedStatement oPrStmt2 = null;
    PreparedStatement oPrStmtSelect = null;
    String strSql = null;
    String strSql2 = null;
    String snomedcode=null;
    ResultSet oRs = null;
    String refid = null;
    String id = null;
    String effectivetime = null;
    String active = null;
    String moduleid = null;
    String conceptid = null;
    String languagecode = null;
    String typeid = null;
    String term = null;
    String caseSignificanceid = null;
    try{
    oRoot = Root.createDbConnection(null);

    strSql = "SELECT  id FROM snomed_conceptdata WHERE active=1 ";
    oPrStmt2 = oRoot.con.prepareStatement(strSql);
    oRsSelect = oPrStmt2.executeQuery();
    while (oRsSelect.next()) {
        snomedcode = Root.TrimString(oRsSelect.getString("id"));

        strSql2 = "SELECT  * FROM snomed_descriptiondata WHERE conceptid =? AND active=1  ";
        oPrStmtSelect = oRoot.con.prepareStatement(strSql2);
        oPrStmtSelect.setString(1,snomedcode);
        oRs =oPrStmtSelect.executeQuery();
        while (oRs.next()) {
            refid = Root.TrimString(oRs.getString("refid")); 
            id = Root.TrimString(oRs.getString("id"));
            effectivetime = Root.TrimString(oRs.getString("effectivetime"));
            active = Root.TrimString(oRs.getString("active"));
            moduleid = Root.TrimString(oRs.getString("moduleid"));
            conceptid = Root.TrimString(oRs.getString("conceptid"));
            languagecode = Root.TrimString(oRs.getString("languagecode"));
            typeid = Root.TrimString(oRs.getString("typeid"));
            term = Root.TrimString(oRs.getString("term"));
            caseSignificanceid = Root.TrimString(oRs.getString("caseSignificanceid"));

            String sql = "INSERT INTO snomedinfo_data (refid,id,effectivetime,active,moduleid,conceptid,languagecode,typeid,term,caseSignificanceid) VALUES( ?, ?, ?,?,?,?,?,?,?,?)";
            oPrStmt = oRoot.con.prepareStatement(sql);
            oPrStmt.setString(1, refid);
            oPrStmt.setString(2, id);
            oPrStmt.setString(3, effectivetime);
            oPrStmt.setString(4, active);
            oPrStmt.setString(5, moduleid);
            oPrStmt.setString(6, conceptid);
            oPrStmt.setString(7, languagecode);
            oPrStmt.setString(8, typeid );
            oPrStmt.setString(9, term);
            oPrStmt.setString(10, caseSignificanceid);
             oPrStmt.executeUpdate();

        }

    }
    System.out.println("done");
    }

    catch (Exception e) {
        e.printStackTrace();

    }

    finally {

        oRsSelect = Root.EcwCloseResultSet(oRsSelect);
        oRs = Root.EcwCloseResultSet(oRs);
        oPrStmt = Root.EcwClosePreparedStatement(oPrStmt);
        oPrStmt = Root.EcwClosePreparedStatement(oPrStmt2);
        oPrStmt = Root.EcwClosePreparedStatement(oPrStmtSelect);
        oRoot = Root.closeDbConnection(null, oRoot);
    }
}
    public static void main(String args[] ) throws Exception 
      { 


      Snomedinfo a = new Snomedinfo();
      a .snomedinfoinsert();

      }


}
The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • 1
    Generally it works like this with a single query call: `insert into t (col1, col2) select colX, colY from other_table` – juergen d Aug 11 '18 at 11:22
  • Try batch updates. it'll be much faster if you execute more than one update in a single transaction. – J.Doe Aug 11 '18 at 11:59
  • how do i avoid using those nested while loops?? is there a better alternative?? –  Aug 11 '18 at 12:11
  • Do you have to use Java? Is it a must? You can do this very fast with MySQL dump and load: https://dev.mysql.com/doc/refman/8.0/en/load-data.html – rodrigoap Aug 11 '18 at 12:53
  • yes i have to do this in java jdbc only –  Aug 11 '18 at 12:54

2 Answers2

3

You can use SQL Insert Into Select Query

INSERT INTO snomedinfo_data 
SELECT refid,id,effectivetime,active,moduleid,conceptid,languagecode,typeid,term,caseSignificanceid FROM snomed_descriptiondata 
WHERE conceptid =? AND active=1  

it will merge your 2nd and 3rd query, so that you can avoid unnecessary iterations.

Jivesh Pednekar
  • 106
  • 1
  • 5
1

Batch inserts are mentioned in the comments (we will come to that in a minute), here are some other optimizations you can do.

At the moment you're selecting the data used for the insert one by one, so if you insert 100,000 rows, you're also selecting 100,000 rows from the database. Instead you can perform one single SQL-statement doing a join over the two tables:

...
    strSql2 = "SELECT d.* FROM snomed_conceptdata c, snomed_descriptiondata d WHERE c.active = 1 and conceptid = c.id AND d.active = 1";
    oPrStmtSelect = oRoot.con.prepareStatement(strSql2);
    oPrStmtSelext.setFetchSize(100);
    oRs = oPrStmtSelect.executeQuery();
    while (oRs.next()) {
...

Remove the first select statement and while-loop and you should already have a significant increase in performance.

The next increase in performance can come from preparing the insert-statement only once. So put

        String sql = "INSERT INTO snomedinfo_data (refid,id,effectivetime,active,moduleid,conceptid,languagecode,typeid,term,caseSignificanceid) VALUES(?,?,?,?,?,?,?,?,?,?)";
        oPrStmt = oRoot.con.prepareStatement(sql);

outside the while-loop. The increase shouldn't be that dramatic like the one when you get rid off all these select-statements, but still.

And finally, instead of calling oPrStmt.executeUpdate(); you can add the execution to a batch by calling oPrStmt.addBatch(). This collects the statements in memory and by calling oPrStmt.executeBatch() all the statements are executed within one transaction. If too many statements are added to the batch, you will end in an OutOfMemoryError, so you might keep count of the added statements and execute a batch after reaching a threshold of e.g. 1000.

Lothar
  • 5,323
  • 1
  • 11
  • 27
  • Also important with batched inserts to MySQL is [rewriteBatchedStatements=true](https://stackoverflow.com/a/26313288/2144390). – Gord Thompson Aug 11 '18 at 13:08
  • is it neccesary to add an index to the "conceptid" column of "snomed_descriptiondata" table –  Aug 11 '18 at 13:16
  • @JohnSmith Independent from these suggestions: Yes, otherwise you'll have a full table scan if you select data from the table. – Lothar Aug 11 '18 at 13:18
  • so just adding an index is enough?? –  Aug 11 '18 at 13:19
  • @JohnSmith Enough for what? Avoiding above changes? No. If you keep the selection row by row, you still have potentially hundreds of thousands of SQL-statements against the database instead of one. – Lothar Aug 11 '18 at 13:21
  • hi the oPrStmtSelext.setFetchSize(100); doesnt work..iam currently using mysql database –  Aug 11 '18 at 14:03
  • getting a out of memory heap space error!! –  Aug 11 '18 at 14:04
  • @JohnSmith If I understand https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html correctly you have to add `useCursorFetch=true` to your connection URL. – Lothar Aug 12 '18 at 09:13