
In my application I fetch some data from my server and, after deserializing it (from JSON to objects), I put these objects into my database. Until today I used this approach:

for (int i = 0; i < receivedJsonArray.length; i++) {
    Bean bean = new Bean();
    // matching some parameters
    // ...
    dao.putObject(bean);
}

Here my dao.putObject(Bean) looked like this (I omitted a few things, like the try-catch block, as they are not really relevant):

public void putObject(Bean bean) {
    database.beginTransaction();
    ContentValues values = new ContentValues();
    values.put("something", bean.getSomething());
    // ... some mapping
    database.insert("table", null, values);
    database.setTransactionSuccessful();
    database.endTransaction();
}

As you can see, every time I deserialize a single object I begin and commit a new transaction. However, this way I use only a fraction of the memory that the second approach below would need.

Bean[] beans = new Bean[receivedJsonArray.length];
for (int i = 0; i < receivedJsonArray.length; i++) {
    Bean bean = new Bean();
    // matching some parameters
    // ...
    beans[i] = bean;
}
dao.putObject(beans);

Now dao.putObject(Bean...) looks like this:

public void putObject(Bean... beans) {
    database.beginTransaction();
    for (int i = 0; i < beans.length; i++) {
        ContentValues values = new ContentValues();
        values.put("something", beans[i].getSomething());
        // ... some mapping
        database.insert("table", null, values);
    }
    database.setTransactionSuccessful();
    database.endTransaction();
}

So now I wonder: which of these two is the better way to handle ~400-500 objects? I feel that beginning and ending a transaction 400-500 times is bad practice. On the other hand, I feel bad about creating and holding a fairly large collection of data in memory.
I'm aware that on the devices I target (Android 4.+) this does not really matter, as they have efficient CPUs and plenty of RAM; however, I feel that's no excuse for inefficient code.

So, what do you suggest? One big batch, inserting every object separately, or maybe inserting a bunch of (e.g.) 100 objects at a time (sketched below)?
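For completeness, the chunked variant I have in mind would look something like this. This is just a sketch: `ChunkedInsert`, `Dao` and `CHUNK_SIZE` are placeholder names for my actual DAO class and a chunk size I would still have to tune.

import java.util.Arrays;

class ChunkedInsert {
    private static final int CHUNK_SIZE = 100; // assumed chunk size, not measured

    // Middle ground between the two approaches above: reuse the varargs
    // putObject(Bean...) so each chunk is committed as one transaction,
    // while only CHUNK_SIZE beans are handed over per call.
    static void putInChunks(Dao dao, Bean[] beans) {
        for (int start = 0; start < beans.length; start += CHUNK_SIZE) {
            int end = Math.min(start + CHUNK_SIZE, beans.length);
            // copyOfRange allocates only the current chunk
            dao.putObject(Arrays.copyOfRange(beans, start, end));
        }
    }
}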

spoko
  • Whether one transaction or multiple transactions are used depends on how the consistency of the batch should be treated. What happens if one item in the batch could not be inserted? Should the items inserted so far be rolled back? If yes, then a single transaction for the whole batch should be used. The decision about holding the whole batch in memory is completely orthogonal IMHO. – Diego Apr 15 '15 at 23:13
  • Agree with @Diego, it's more about transaction policy (atomic operations), not performance. – Alex Salauyou Apr 15 '15 at 23:20
  • That being said, there _is_ an important performance difference between using a single transaction or one for each record. See for example http://stackoverflow.com/q/3501516/82788 – matiash Apr 16 '15 at 05:11
  • Committing 500 transactions will be much too slow. Did you ever measure the time? – CL. Apr 16 '15 at 08:06
  • @CL. I don't have 500 real objects to test with yet; I'm currently refactoring my code, and the first approach I presented in my question bothered me as a probable source of inefficiency. @matiash thanks for that link, the accepted answer there is pretty clear that one big batch is the faster solution. `android.database.DatabaseUtils.InsertHelper` also looks promising and I'm going to use that approach (a minimal sketch follows these comments), as it's fine to roll back everything if there is a problem - the application won't produce anything useful if the data is corrupted somehow (like missing rows) – spoko Apr 16 '15 at 11:35
  • @CL. I found a way to benchmark it, and the results are unambiguous. I'm going to publish them in an answer. – spoko Apr 16 '15 at 12:11
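For reference, a minimal sketch of the `DatabaseUtils.InsertHelper` approach mentioned above ("table" and "something" are the placeholder names used elsewhere in this question; note that InsertHelper has since been deprecated in favor of SQLiteStatement):

import android.database.DatabaseUtils;
import android.database.sqlite.SQLiteDatabase;

public void insertWithHelper(SQLiteDatabase database, Bean... beans) {
    // InsertHelper compiles the INSERT statement once and rebinds
    // values for every row instead of rebuilding it per insert.
    DatabaseUtils.InsertHelper helper = new DatabaseUtils.InsertHelper(database, "table");
    int somethingCol = helper.getColumnIndex("something");
    database.beginTransaction();
    try {
        for (Bean bean : beans) {
            helper.prepareForInsert();
            helper.bind(somethingCol, bean.getSomething()); // assuming getSomething() returns a String
            helper.execute();
        }
        database.setTransactionSuccessful();
    } finally {
        database.endTransaction();
        helper.close();
    }
}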

1 Answer


I measured the time, and the results are unambiguous.

The DAO method used is shown below:

public void addObject(Bean... beans) {
    SQLiteDatabase database = dbHelper.openDatabase();
    database.beginTransaction();
    try {
        for (int i = 0; i < beans.length; i++) {
            ContentValues values = new ContentValues();
            // putting stuff in values
            database.insertWithOnConflict("table", null, values, SQLiteDatabase.CONFLICT_REPLACE);
        }
        database.setTransactionSuccessful();
    } catch (SQLException e) {
        Log.e(AppConstants.DEBUG_TAG, e.toString());
    } finally {
        database.endTransaction();
    }
    dbHelper.closeDatabase();
}

What I measured:

Approach #1:

        dao.deleteRows(); // so I'm inserting into an empty table
        long time = new Date().getTime();
        for(int i=0; i<objectsArray.length()*50; i++){
            JSONArray entry = (JSONArray) objectsArray.get(i%objectsArray.length());

            Bean bean = new Bean();
            bean.setStuff(entry.getInt(0));
            // and so on...
            dao.addObject(bean);
        }

        long timeFinish = new Date().getTime();
        long timeTotal = timeFinish - time;

Approach #2:

        dao.deleteRows(); // so I'm inserting into an empty table
        long time = new Date().getTime();
        Bean[] beans = new Bean[objectsArray.length()*50];
        for(int i=0; i<objectsArray.length()*50; i++){
            JSONArray entry = (JSONArray) objectsArray.get(i%objectsArray.length());

            Bean bean = new Bean();
            bean.setStuff(entry.getInt(0));
            // and so on...
            beans[i] = bean;
        }
        dao.addObject(beans);
        long timeFinish = new Date().getTime();
        long timeTotal = timeFinish - time;

Results:

Each approach was tested ten times, to make sure the results are reasonably reproducible.

Values of timeTotal (in milliseconds) for Approach #1:
20628, 20964, 19699, 19380, 20299, 20553, 19715, 18239, 20267, 18711
Values of timeTotal (in milliseconds) for Approach #2:
1289, 1333, 1288, 1388, 1294, 1393, 1344, 1334, 1376, 1360

Summary:
The numbers speak louder than words: roughly 20 seconds on average for Approach #1 versus about 1.3 seconds for Approach #2, a ~15x speedup. I don't know how to check the exact amount of memory used here, but even if I used 1-2 MB of RAM (which is, given the structure of the Bean I use, a rather extreme estimate) for those ~1.3 seconds, nothing bad should happen, as the Bean[] array is a local variable. Once the method returns, it goes out of scope and should be garbage-collected fairly quickly, and everyone is happy. These results probably also depend on the SQL logic of the tables and on the data itself.

Feel free to leave a comment if you do not agree with me on that.

spoko