2

I have a map populated with a huge amount of data (roughly 300,000 records), and I am iterating over it as below:

    for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
        // send an email with the map key as the email id
        // the email content is populated from the list
        // perform a SQL update to a column with the dynamic value generated here from the email id
    }

As mentioned above, I am worried about the performance impact of the operations performed inside this for loop.

Update:

The scenario is: I am iterating a map which holds a large amount of data.

While iterating it I get the user ids, and for each user id I have to perform a computation (for example, userid + some constants) whose result must be updated in a database table.

That value should also be added to the email content, along with the list values from my map. A rough sketch of the per-iteration work is shown below.
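Roughly, each iteration does something like this (the helper names `computeDynamicValue`, `buildEmailBody`, `sendEmail` and the table/column names are just placeholders, not my real code):

    for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
        String emailId = entry.getKey();                    // map key is the email id
        String body = buildEmailBody(entry.getValue());     // content built from the list
        String dynamicValue = computeDynamicValue(emailId); // e.g. userid + some constants

        sendEmail(emailId, body);                           // one mail per iteration

        // one UPDATE per iteration -- this is the part I am worried about
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE user_table SET some_column = ? WHERE email_id = ?")) {
            ps.setString(1, dynamicValue);
            ps.setString(2, emailId);
            ps.executeUpdate();
        }
    }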

So I thought batch updates are not possible; am I correct in that understanding?

Should I follow this approach, or go with any alternative ideas?

Santhosh
  • The ideal scenario is not to make costly resource calls, like DB calls, in loops. Use intermediate collections to store the data and use bulk update/bulk delete. If it is with `Hibernate`, do a `flush` after the loop. – Vinay Veluri Sep 25 '14 at 11:12
  • Thanks for your suggestion. As I have explained the operations performed, you can see that bulk updates can't be done, as I generate dynamic values here. – Santhosh Sep 25 '14 at 11:13
  • Create a map with a key to store the email and, as the value, any dynamic value you need. Instead of doing the SQL update, add to the map (although it is dynamic, all the values that can be updated will be in the map). You can then update directly from the map values. – Vinay Veluri Sep 25 '14 at 11:17
  • Updated my question. Hope it is clear now. – Santhosh Sep 25 '14 at 11:25
  • The loop does 2 things? 1) send an email and 2) update the DB? Can you add some dummy example of what you are doing in the loop? – maaz Sep 29 '14 at 10:47
  • Updated it in my question. Kindly see the update part. – Santhosh Sep 29 '14 at 11:30

6 Answers

2

Instead of updating the database in each iteration of the loop, try to update the database after completing the loop.

There are different ways to optimize a large database update. The best one is to:

  • Insert the modified values into a temporary table
  • Update the original table from the temporary table

Also, use a thread-based queuing mechanism to send your emails. A sketch of the temporary-table idea is shown below.
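A minimal sketch of the temporary-table approach, assuming a staging table `email_update_tmp(email_id, dynamic_value)`, a target table `user_table`, and MySQL-style `UPDATE ... JOIN` syntax (the table/column names, `computeDynamicValue`, and the join syntax are assumptions to adapt to your schema and database):

    // 1) bulk-insert the computed values into the temporary/staging table
    try (PreparedStatement insert = conn.prepareStatement(
            "INSERT INTO email_update_tmp (email_id, dynamic_value) VALUES (?, ?)")) {
        int count = 0;
        for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
            String emailId = entry.getKey();
            insert.setString(1, emailId);
            insert.setString(2, computeDynamicValue(emailId)); // placeholder for your computation
            insert.addBatch();
            if (++count % 1000 == 0) {
                insert.executeBatch();                         // flush in chunks of 1000
            }
        }
        insert.executeBatch();                                 // flush the remainder
    }

    // 2) update the original table from the temporary table in one set-based statement
    try (Statement update = conn.createStatement()) {
        update.executeUpdate(
            "UPDATE user_table u JOIN email_update_tmp t ON u.email_id = t.email_id "
          + "SET u.some_column = t.dynamic_value");
    }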

Prasanth V J
  • Thanks for your answer. I have updated my question; hope it is clear now. Can you please help with the high-level design? – Santhosh Sep 25 '14 at 11:28
  • Writing to a temporary table will still hamper performance, as it involves DB calls and additionally copying data into the original tables. This is not what is required. – Nachiket Kate Sep 25 '14 at 13:44
  • @NachiketKate thanks for your suggestion. Feel free to add an answer if you have one :) – Santhosh Sep 25 '14 at 14:02
  • I think you have some options which are spread across the given answers. – Nachiket Kate Sep 25 '14 at 14:11
2

The for loop is taking time for two reasons:
1) Individual emails: improve this by reusing one Transport connection for many messages.
2) Individual commits: improve this by committing in batches instead of per record.

So the ideal approach is to handle both. I would recommend doing it in batches of 1000, then playing with the number.

Example

int BATCH_SIZE = 1000;
// supply your own JDBC URL, user and password here
Connection conn = DriverManager.getConnection(jdbcUrl, "username", "password");
conn.setAutoCommit(false);          // commit manually, once per batch
Statement stmt = conn.createStatement();
int count = 0;

Map<String, String> emails_map = new HashMap<String, String>(BATCH_SIZE);
for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
    String email = get_useremail();
    String const_val = do_magic(); // this is how you are computing some constant
    String body = construct_body();

    count++;
    String SQL = "YOUR UPDATE STATEMENT";
    stmt.executeUpdate(SQL);
    emails_map.put(email, body);

    if (count % BATCH_SIZE == 0) {
        // commit all updates of this batch in one transaction
        conn.commit();
        // bulk send the emails collected so far, see
        // http://stackoverflow.com/questions/13287515/how-to-send-bulk-mails-using-javax-mail-api-efficiently-can-we-use-reuse-auth
        bulk_emails_send(emails_map);
        emails_map.clear();
    }
}
// flush whatever is left after the last full batch
conn.commit();
bulk_emails_send(emails_map);


public void bulk_emails_send(Map<String, String> emails_map) throws MessagingException {
    // Get the default Session object through your settings
    Session session = Session.getDefaultInstance(properties);
    // reuse one Transport connection for the whole batch
    Transport t = session.getTransport();
    t.connect();
    try {
        for (Map.Entry<String, String> e : emails_map.entrySet()) {
            Message m = new MimeMessage(session);
            // set to (e.getKey()), from, subject and body (e.getValue()) here
            m.saveChanges();
            t.sendMessage(m, m.getAllRecipients());
        }
    } finally {
        t.close();
    }
}
Byungjoon Lee
maaz
  • What is the `do_magic()` method, and can you also elaborate on sending bulk mail and the batch update? – Santhosh Oct 01 '14 at 12:43
  • `do_magic` is how you are calculating your constant; the bulk/batch update was already there, and I have also added bulk email. – maaz Oct 01 '14 at 16:20
1

My suggestion is to use a stored procedure, or you can use a batch-wise update instead of this; a sketch of the stored-procedure route is shown below.

More about SQL batch update.
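A minimal sketch of the stored-procedure route; the procedure name `update_user_value`, its parameters, and `computeDynamicValue` are assumptions, not something from the question:

    // assumes a procedure like:
    // CREATE PROCEDURE update_user_value(IN p_email VARCHAR(255), IN p_value VARCHAR(255)) ...
    try (CallableStatement cs = conn.prepareCall("{call update_user_value(?, ?)}")) {
        for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
            String emailId = entry.getKey();
            cs.setString(1, emailId);
            cs.setString(2, computeDynamicValue(emailId)); // placeholder for your computation
            cs.addBatch();                                 // batch the procedure calls as well
        }
        cs.executeBatch();
    }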

Ruchira Gayan Ranaweera
1

You should use the JDBC batch update feature.

While iterating over the map you add a batch to your prepared statement. When you have added (say) 2000 records, you call `stmt.executeBatch()`, which updates those 2000 different records in a fast way.

You can see an example here:

http://www.mkyong.com/jdbc/jdbc-preparedstatement-example-batch-update/

Second thing: if you can, commit the transaction after each `executeBatch()`. A single transaction for 300k records may be too much for your database configuration. Splitting the update into many transactions will give better performance, but only if you do not need one transaction covering all the records.
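A minimal sketch of that approach, assuming an update statement like `UPDATE user_table SET some_column = ? WHERE email_id = ?` and a placeholder `computeDynamicValue` helper (your actual statement and computation will differ):

    final int BATCH_SIZE = 2000;
    conn.setAutoCommit(false);
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE user_table SET some_column = ? WHERE email_id = ?")) {
        int count = 0;
        for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
            String emailId = entry.getKey();
            ps.setString(1, computeDynamicValue(emailId)); // placeholder for your computation
            ps.setString(2, emailId);
            ps.addBatch();

            if (++count % BATCH_SIZE == 0) {
                ps.executeBatch();   // send the accumulated updates in one round trip
                conn.commit();       // keep each transaction small
            }
        }
        ps.executeBatch();           // flush the last partial batch
        conn.commit();
    }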

przemek hertel
1

I would do something like this: prepare the data for the operation first.

I suppose you are updating a table like `user`, which should have a unique id.

Map<String, String> emailIds = new HashMap<String, String>();
Map<String, String> emails = new HashMap<String, String>();
for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
    // DO NOT send the email here
    // DO NOT perform the SQL update with the dynamic value here
    // instead, collect everything first:
    String emailId = entry.getKey();                  // map key is the email id
    String content = buildContent(entry.getValue());  // placeholder: build the email content from the list
    String id = computeDynamicValue(emailId);         // placeholder: the dynamic value you generate
    emails.put(emailId, content);
    emailIds.put(id, emailId);
}

bulkEmailSend(emails);
bulkEmailUpdate(emailIds);

bulkEmailSend and bulkEmailUpdate are the methods you would write to make the appropriate calls.

So, use the bulk email send and bulk email-id update methods to write the values back to the database.

Vinay Veluri
1

I'll try to summarize all the good points mentioned above.

Your options are:

  1. Use multithreading wherever possible, but keep in mind that multithreading comes at the cost of additional memory (and, if you exhaust it, heap dumps and application downtime).

  2. Another good option is a bulk update, but a bulk update comes at the cost of longer lock times at the database end. Use it wisely, as other threads will be waiting until the updating thread finishes, because the update takes an exclusive lock (no sharing possible).

  3. Try threading along with async jobs (especially for the mails): use a different server/process for mail handling and send mail requests to it asynchronously, which is quite fast because mail generation is handled by another process. (In real life, mail exchange can tolerate some delay, so a slow mail server is acceptable, but a slow application server is not.) A sketch of this hand-off is shown at the end of this answer.

  4. If possible, move your complete DB update logic into a stored procedure on the DB server, which will save you a lot of time. (Rule of thumb: always let A handle the tasks A is designed/optimized for, i.e. databases are designed for faster DB operations than programming languages.)

I couldn't give you an answer specific to your exact needs, but I hope this is helpful in terms of improving it :)
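For option 3, a minimal sketch of handing the mail work off asynchronously; `buildEmailBody` and `sendToMailServer` are hypothetical stand-ins for however you build content and talk to your separate mail process:

    // a small pool that queues mail requests so the main loop never blocks on SMTP
    ExecutorService mailQueue = Executors.newFixedThreadPool(4);

    for (Map.Entry<String, List<ClassOBj>> entry : testMap.entrySet()) {
        final String emailId = entry.getKey();
        final String body = buildEmailBody(entry.getValue()); // placeholder: content from the list
        mailQueue.submit(new Runnable() {
            public void run() {
                sendToMailServer(emailId, body);              // placeholder: hand off to the mail process
            }
        });
        // ... do the (batched) DB update work here ...
    }

    mailQueue.shutdown(); // then await termination if you need to block until all mails are handed off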

Nachiket Kate