5

I have a relatively simply routine that looks at database entries for media files, calculates the width, height and filesize, and writes them back into the database.

The database is SQLite, using the System.Data.SQLite library, processing ~4000 rows. I load all rows into an ADO table, update the rows/columns with the new values, then run adapter.Update(table); on it.

Loading the dataset from the db tables half a second or so, updating all the rows with image width/height and getting the file length from FileInfo took maybe 30 seconds. Fine.

The adapter.Update(table); command took somewhere in the vicinity of 5 to 7 minutes to run.

That seems awfully excessive. The ID is a PK INTEGER and thus - according to SQLite's docs, is inherently indexed, yet even so I can't help but think that if I were to run a separate update command for each individual update, this would have completed much faster.

I had considered ADO/adapters to be relatively low level (as opposed to ORMs anyway), and this terrible performance surprised me. Can anyone shed some light on why it would take 5-7 minutes to update a batch of ~4000 records against a locally placed SQLite database?

As a possible aside, is there some way to "peek into" how ADO is processing this? Internal library stepthroughs or...??

Thanks

public static int FillMediaSizes() {
        // returns the count of records updated

        int recordsAffected = 0;

        DataTable table = new DataTable();
        SQLiteDataAdapter adapter = new SQLiteDataAdapter();

        using (SQLiteConnection conn = new SQLiteConnection(Globals.Config.dbAppNameConnectionString))
        using (SQLiteCommand cmdSelect = new SQLiteCommand())
        using (SQLiteCommand cmdUpdate = new SQLiteCommand()) {

            cmdSelect.Connection = conn;
            cmdSelect.CommandText =
                "SELECT ID, MediaPathCurrent, MediaWidth, MediaHeight, MediaFilesizeBytes " +
                "FROM Media " +
                "WHERE MediaType = 1 AND (MediaWidth IS NULL OR MediaHeight IS NULL OR MediaFilesizeBytes IS NULL);";

            cmdUpdate.Connection = conn;
            cmdUpdate.CommandText =
                "UPDATE Media SET MediaWidth = @w, MediaHeight = @h, MediaFilesizeBytes = @b WHERE ID = @id;";

            cmdUpdate.Parameters.Add("@w", DbType.Int32, 4, "MediaWidth");
            cmdUpdate.Parameters.Add("@h", DbType.Int32, 4, "MediaHeight");
            cmdUpdate.Parameters.Add("@b", DbType.Int32, 4, "MediaFilesizeBytes");
            SQLiteParameter param = cmdUpdate.Parameters.Add("@id", DbType.Int32);
            param.SourceColumn = "ID";
            param.SourceVersion = DataRowVersion.Original;

            adapter.SelectCommand = cmdSelect;
            adapter.UpdateCommand = cmdUpdate;

            try {
                conn.Open();
                adapter.Fill(table);
                conn.Close();
            }
            catch (Exception e) {
                Core.ExceptionHandler.HandleException(e, true);
                throw new DatabaseOperationException("", e);
            }

            foreach (DataRow row in table.Rows) {

                try {

                    using (System.Drawing.Image img = System.Drawing.Image.FromFile(row["MediaPathCurrent"].ToString())) {

                        System.IO.FileInfo fi;

                        fi = new System.IO.FileInfo(row["MediaPathCurrent"].ToString());

                        if (img != null) {

                            int width = img.Width;
                            int height = img.Height;
                            long length = fi.Length;

                            row["MediaWidth"] = width;
                            row["MediaHeight"] = height;
                            row["MediaFilesizeBytes"] = (int)length;
                        }
                    }
                }
                catch (Exception e) {
                    Core.ExceptionHandler.HandleException(e);
                    DevUtil.Print(e);
                    continue;
                }
            }                


            try {
                recordsAffected = adapter.Update(table);
            }
            catch (Exception e) {
                Core.ExceptionHandler.HandleException(e);
                throw new DatabaseOperationException("", e);
            }


        }

        return recordsAffected;
    }
jleach
  • 7,410
  • 3
  • 33
  • 60
  • [Debug .NET Framework Source Code in Visual Studio 2012?](http://stackoverflow.com/questions/15185725/debug-net-framework-source-code-in-visual-studio-2012) answers: *is there some way to "peek into" how ADO is processing this?* – Liam Jul 08 '15 at 08:37
  • 1
    _[Improve INSERT-per-second performance of SQLite?](http://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite)_ Though it's from the point of view of `c`, `sqlite` remains a constant –  Jul 08 '15 at 08:38

2 Answers2

5

Use Connection.BeginTransaction() to speed up the DataAdapter update.

conn.Open() 'open connection
Dim myTrans As SQLiteTransaction
myTrans = conn.BeginTransaction() 
'Associate the transaction with the select command object of the DataAdapter
objDA.SelectCommand.Transaction = myTrans 

objDA.Update(objDT)

Try
    myTrans.Commit()
Catch ex As Exception
    myTrans.Rollback()
End Try
conn.Close()

This vastly speeds up the update.

Mark
  • 1,360
  • 3
  • 20
  • 37
2

Loading the dataset from the db tables half a second or so

This is a single SQL statement (so it's fast). Excute SQL SELECT, populate the dataset, done.

updating all the rows with image width/height and getting the file length from FileInfo took maybe 30 seconds. Fine.

This is updating the in memory data (so that's fast too), change x row in the dataset, don't talk to SQL at all.

The adapter.Update(table); command took somewhere in the vicinity of 5 to 7 minutes to run.

This will run a SQL update for every updated row. Which is why it's slow.

yet even so I can't help but think that if I were to run a separate update command for each individual update, this would have completed much faster.

This is basically what it's doing anyway!


From MSDN

The update is performed on a by-row basis. For every inserted, modified, and deleted row, the Update method determines the type of change that has been performed on it (Insert, Update or Delete). Depending on the type of change, the Insert, Update, or Delete command template executes to propagate the modified row to the data source. When an application calls the Update method, the DataAdapter examines the RowState property, and executes the required INSERT, UPDATE, or DELETE statements iteratively for each row, based on the order of the indexes configured in the DataSet.


is there some way to "peek into" how ADO is processing this?

Yes: Debug .NET Framework Source Code in Visual Studio 2012?

Community
  • 1
  • 1
Liam
  • 27,717
  • 28
  • 128
  • 190
  • Thank you Liam. Apologies in that I should have been able to find this MSDN reference myself before asking. I had expected some sort of internal optimization, and clearly that was wrong! – jleach Jul 08 '15 at 08:49