
I am trying to insert a large text file into an Oracle database. My current program works, but it is loading very slowly. The text file is around 400 MB.

What I did is shown below...

...
ArrayList<String> data = new ArrayList<String>();
String line;
while ((line = br.readLine()) != null) {
    // add the fixed-width fields of each line, separated by "="
    data.add(line.substring(0, 6) + "=" +
             line.substring(6, 24) + "=" +
             line.substring(24, 30) + "=" +
             line.substring(30, 48));
}
db.insertRecord(data);
...

public void insertRecord(ArrayList<String> data) {
    String sql = "Insert into Account (NAME, ID, RCBS, CA_NUM, GUID, PARN_GUID)" +
                 " values " +
                 "(?,?,?,?,?,?)";
    ...
    ps = con.prepareStatement(sql);
    for (int i = 0; i < data.size(); i++) {
        String[] fields = data.get(i).split("=");
        ps.setString(1, fields[0]);
        ps.setString(2, fields[1]);
        ps.setString(3, fields[2]);
        ps.setString(4, fields[3]);
        ps.setString(5, fields[0].trim() + "." + fields[1].trim() + "." + fields[2].trim() + "." + fields[3].trim()); // GUID index
        ps.setString(6, fields[0].trim() + "." + fields[1].trim()); // PARN index
        ps.execute();
    } // end loop
    con.commit();
...

Are there any performance improvements that can be made to increase the speed of the data load?

logger
  • Use batch inserts. – CrazySabbath Oct 27 '17 at 14:15
  • Please stop and do it the right way, for example by using SQL*Loader or an external table. – Lukasz Szozda Oct 27 '17 at 14:18
  • The problem I am having is that there is a weird ASCII character in the text file. When I use SQL*Loader, it loads it as > in the table column. – logger Oct 27 '17 at 14:21
  • Use @CrazySabbath's approach; I had a similar issue. I broke it into "chunks" - I actually built an asynchronous API (that would utilize multiple threads) to handle our document processing - a little overkill, but the learning process was great haha – Tez Wingfield Oct 27 '17 at 14:24
  • Hi - what happens to the "weird ASCII character" when inserted via Java? I'd expect that you can speed this up by specifying the character set in SQLLDR, e.g. LOAD DATA CHARACTERSET MSWIN1252 – Niall Litchfield Oct 27 '17 at 15:19
  • For testing, I recommend commenting out the `ps.execute()` line; then you can see whether the bottleneck is really Oracle. – Wernfried Domscheit Oct 27 '17 at 16:55

2 Answers

1

As suggested, you should use batch inserts when dealing with that many records. Note that this piece of code executes the accumulated inserts once every 1000 rows (controlled by batchInterval) rather than all at once, to avoid possible memory problems.

int batchInterval = 1000;

ps = con.prepareStatement(sql);
for (int i = 0; i < data.size(); i++) {
    String[] fields = data.get(i).split("=");
    ps.setString(1, fields[0]);
    ps.setString(2, fields[1]);
    ps.setString(3, fields[2]);
    ps.setString(4, fields[3]);
    ps.setString(5, fields[0].trim() + "." + fields[1].trim() + "." + fields[2].trim() + "." + fields[3].trim()); // Index
    ps.setString(6, fields[0].trim() + "." + fields[1].trim()); // PARN Index
    ps.addBatch();

    // execute the accumulated batch every batchInterval rows
    if ((i + 1) % batchInterval == 0) ps.executeBatch();
}

ps.executeBatch(); // flush the remaining rows
con.commit();
ps.close();
con.close();

You can find more here: batch inserts tutorial
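For completeness, here is a minimal sketch of how the whole method could look with auto-commit disabled and the statement closed via try-with-resources. The class and method names are illustrative only, and the explicit setAutoCommit(false) call is an assumption (the question's con.commit() suggests auto-commit is already off, but it is worth being explicit):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class AccountLoader {

    private static final String INSERT_SQL =
        "Insert into Account (NAME, ID, RCBS, CA_NUM, GUID, PARN_GUID) values (?,?,?,?,?,?)";

    // Batched insert: rows travel to Oracle in groups of batchInterval
    // instead of one network round trip (and one commit) per row.
    public static void insertRecords(Connection con, List<String> data, int batchInterval)
            throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            int count = 0;
            for (String row : data) {
                String[] fields = row.split("=");
                ps.setString(1, fields[0]);
                ps.setString(2, fields[1]);
                ps.setString(3, fields[2]);
                ps.setString(4, fields[3]);
                ps.setString(5, fields[0].trim() + "." + fields[1].trim() + "."
                              + fields[2].trim() + "." + fields[3].trim()); // GUID
                ps.setString(6, fields[0].trim() + "." + fields[1].trim());  // PARN_GUID
                ps.addBatch();
                if (++count % batchInterval == 0) {
                    ps.executeBatch(); // send the pending rows in one round trip
                    con.commit();      // keep the transaction size bounded
                }
            }
            ps.executeBatch();         // flush the last partial batch
            con.commit();
        }
    }
}

Committing once per batch keeps each transaction small; a single commit at the very end also works if the undo tablespace can accommodate the whole load.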

CrazySabbath
-1

Send it to the server as a CLOB (many rows in one portion) and parse it on the server side.
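For reference, a rough sketch of what the client side of this could look like, assuming a hypothetical PL/SQL procedure parse_and_insert_accounts(p_data CLOB) that splits the text into lines and performs the inserts server-side (the procedure name, its signature, and the chunking are assumptions, not part of this answer):

import java.sql.CallableStatement;
import java.sql.Clob;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

public class ClobLoader {

    // Hypothetical server-side procedure that parses the CLOB and inserts the rows.
    private static final String CALL = "{call parse_and_insert_accounts(?)}";

    // Send one chunk of raw file lines to the database as a single CLOB so
    // that parsing and inserting happen in one round trip.
    public static void sendChunk(Connection con, List<String> lines) throws SQLException {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        Clob clob = con.createClob(); // temporary CLOB (JDBC 4)
        try (CallableStatement cs = con.prepareCall(CALL)) {
            clob.setString(1, sb.toString());
            cs.setClob(1, clob);
            cs.execute();
        } finally {
            clob.free();
        }
    }
}

The caller would split the 400 MB file into manageable chunks (the comments below mention 32 KB portions) and call sendChunk once per chunk.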

Needle file
  • Are you suggesting he parse 400 MB CLOBs on the Oracle database? – CrazySabbath Oct 27 '17 at 14:17
  • Why not? Send it in 32 KB portions - much more effective than 100,000 individual inserts. This is how I import up to 10 GB of CSV files into the production database every month. – Needle file Oct 27 '17 at 14:20