0

I am using an SQLite database, and inserting records into it. This takes a hugely long time! I have seen people who say they can process a couple thousand in a minute. I have around 2400 records. Each record takes 30s-2m to complete. Recreating the database is not an option. I have tried to create one transaction different ways. I need to use the timer, because I am using a ProgressBar to show me that something is happening. Here is the code I am using:

string con;
con = string.Format(@"Data Source={0}", documentsFolder);

SQLiteConnection sqlconnection = new SQLiteConnection(con);
SQLiteCommand sqlComm = sqlconnection.CreateCommand();
sqlconnection.Open();
SQLiteTransaction transaction = sqlconnection.BeginTransaction();

Timer timer2 = new Timer();
timer2.Interval = 1000;
timer2.Tick += (source, e) =>
                    {
                        URL u = firefox.URLs[count2];
                        string newtitle = u.title;
                        form.label1.Text = count2 + "/" + pBar.Maximum;
                        string c_urls = "insert or ignore into " + table + " (id,
 url, title, visit_count, typed_count, last_visit_time, hidden) values (" + dbID + ",'" + u.url + "','" 
    + newtitle + "',1,1, " + ToChromeTime(u.visited) + ", 0)";
                        string c_visited = "insert or ignore into " + table2 + " (id,
 url, 
    visit_time, transition) values (" + dbID2 + "," + dbID + "," + 
ToChromeTime(u.visited) + ",805306368)";
                        sqlComm = new SQLiteCommand(c_urls, sqlconnection);
                        sqlComm.ExecuteNonQuery();
                        sqlComm = new SQLiteCommand(c_visited, sqlconnection);
                        sqlComm.ExecuteNonQuery();

                        dbID++;
                        dbID2++;


                        pBar.Value = count2;
                        if (pBar.Maximum == count2)
                        {
                            pBar.Value = 0;
                            timer.Stop();
                            transaction.Commit();
                            sqlComm.Dispose();
                            sqlconnection.Dispose();
                            sqlconnection.Close();
                        }

                        count2++;
                    };
timer2.Start();

What am I doing wrong?

Nasreddine
  • 36,610
  • 17
  • 75
  • 94
CC Inc
  • 5,842
  • 3
  • 33
  • 64
  • Have you tried to run the insert command against the database directly (rather than through C#)? How many items are in the table you're inserting into? What does its schema look like (fields + keys + indexes)? Do you have any triggers on this table? – StriplingWarrior Jul 19 '12 at 22:59
  • Ok, I am inserted into a standard Chrome history DB. I am inserting around 2400 records. – CC Inc Jul 19 '12 at 23:04
  • @StriplingWarrior asked you specific questions (4 of them), none of which you addressed in your reply. "a standard Chrome history DB" doesn't answer "Have you tried running against the DB directly?", "How many items are in the table?" (which is different than "how many items are you inserting), or provide any information to those of use who aren't familiar with "standard Chrome history DB" schemas, but are familiar with running SQLite insert statements. – Ken White Jul 19 '12 at 23:09
  • Sorry. I cannot try directly, because the values are in bulk, from another database. Before I insert into the database, I clear the tables urls and visits, so that would be 0 values. Here is the code I found to create the urls table: `CREATE TABLE urls(id INTEGER PRIMARY KEY,url LONGVARCHAR,title LONGVARCHAR,visit_count INTEGER DEFAULT 0 NOT NULL, typed_count INTEGER DEFAULT 0 NOT NULL,last_visit_time INTEGER NOT NULL,hidden INTEGER DEFAULT 0 NOT NULL, favicon_id INTEGER DEFAULT 0 NOT NULL); CREATE INDEX urls_favicon_id_INDEX ON urls (favicon_id); CREATE INDEX urls_url_index ON urls (url);` – CC Inc Jul 19 '12 at 23:23
  • And here is the code for the visits table: `CREATE TABLE visits(id INTEGER PRIMARY KEY,url INTEGER NOT NULL,visit_time INTEGER NOT NULL,from_visit INTEGER,transition INTEGER DEFAULT 0 NOT NULL,segment_id INTEGER,is_indexed BOOLEAN);CREATE INDEX visits_from_index ON visits (from_visit); CREATE INDEX visits_time_index ON visits (visit_time); CREATE INDEX visits_url_index ON visits (url);` – CC Inc Jul 19 '12 at 23:25
  • Duplicate of http://stackoverflow.com/questions/3852068/sqlite-insert-very-slow – david_p Dec 04 '14 at 09:20

3 Answers3

2

This is what I would address, in order. It may or may not fix the problem, but it won't hurt to see (and it might just do some magic):

  1. Ensure the Database is not being contended with updates (from another thread, process, or even timer!). Writers will acquire locks and unclosed/over-long-running transactions can interact in bad ways. (For updates that take "30 seconds to 2 minutes" I would imagine there is an issue obtaining locks. Also ensure the media the DB is on is sufficient, e.g. local drive.)

  2. The transaction is not being used (??). Move the transaction inside the timer callback, attach it to the appropriate SQLCommands, and dispose it before the callback ends. (Use using).

  3. Not all SQLCommand's are being disposed correctly. Dispose each and every one. (The use of using simplifies this. Do not let it bleed past the callback.)

  4. Placeholders are not being used. Not only is this simpler and easier to use, but it is also ever so slightly more friendly to SQLite and the adapter.

(Example only; there may be errors in the following code.)

// It's okay to keep long-running SQLite connections.
// In my applications I have a single application-wide connection.
// The more important thing is watching thread-access and transactions.
// In any case, we can keep this here.
SQLiteConnection sqlconnection = new SQLiteConnection(con);
sqlconnection.Open();

// In timer event - remember this is on the /UI/ thread.
// DO NOT ALLOW CROSS-THREAD ACCESS TO THE SAME SQLite CONNECTION.
// (You have been warned.)
URL u = firefox.URLs[count2];
string newtitle = u.title;
form.label1.Text = count2 + "/" + pBar.Maximum;

try {
   // This transaction is ONLY kept about for this timer callback.
   // Great care must be taken with long-running transactions in SQLite.
   // SQLite does not have good support for (long running) concurrent-writers
   // because it must obtain exclusive file locks.
   // There is no Table/Row locks!
   sqlconnection.BeginTransaction();
   // using ensures cmd will be Disposed as appropriate.
   using (var cmd = sqlconnection.CreateCommand()) {
     // Using placeholders is cleaner. It shouldn't be an issue to
     // re-create the SQLCommand because it can be cached in the adapter/driver
     // (although I could be wrong on this, anyway, it's not "this issue" here).
     cmd.CommandText = "insert or ignore into " + table
       + " (id, url, title, visit_count, typed_count, last_visit_time, hidden)"
       + " values (@dbID, @url, 'etc, add other parameters')";
     // Add each parameter; easy-peasy
     cmd.Parameters.Add("@dbID", dbID);
     cmd.Parameter.Add("@url", u.url);
     // .. add other parameters
     cmd.ExecuteNonQuery();
   }
   // Do same for other command (runs in the same TX)
   // Then commit TX
   sqlconnection.Commit();
} catch (Exception ex) {
   // Or fail TX and propagate exception ..
   sqlconnection.Rollback();
   throw;
}

if (pBar.Maximum == count2)
{
    pBar.Value = 0;
    timer.Stop();
    // All the other SQLite resources are already
    // cleaned up!
    sqlconnection.Dispose();
    sqlconnection.Close();
}
  • Could you give me an example? – CC Inc Jul 20 '12 at 22:04
  • @CCInc Updated with a "I just typed it up" example. –  Jul 20 '12 at 23:38
  • @CCInc Also, verify the *other* database uses to make sure long-running transactions are not causing [minor] deadlocks .. –  Jul 20 '12 at 23:40
  • Works great, except I get this error: `The event 'System.Data.SQLite.SQLiteConnection.RollBack' can only appear on the left hand side of += or -= ` And the same for Commit – CC Inc Jul 21 '12 at 01:50
  • Odd, should be this `Rollback` (the [method](http://www.devart.com/dotconnect/sqlite/docs/Devart.Data.SQLite~Devart.Data.SQLite.SQLiteConnection~Rollback.html)) .. that error message makes it sounds like it is trying to attach to *events*? –  Jul 21 '12 at 02:51
  • In any case, if transaction are not needed, those lines can be commented out (BeginTransaction, Rollback, Commit) .. if transaction are needed, use the SQLiteTransaction object returned from SQLiteConnection.BeginTransaction; in this case make sure the pass it to the SQLiteCommand's as well (cmd.Transaction = theOpenTransaction). –  Jul 21 '12 at 02:58
  • Ok, I incorperated your code with this code [here](http://www.devart.com/dotconnect/sqlite/docs/Devart.Data.SQLite~Devart.Data.SQLite.SQLiteConnection~BeginTransaction().html), and I get the error 'Cannot access a disposed object'. [Here](http://pastebin.com/Ka4nQb06) is my code. – CC Inc Jul 21 '12 at 22:32
  • @CCInc The message is just what it says. An object that had `Dispose` invoked upon it was used later. In this case it is a SQLCommand object. See where I create a *new* SQLCommand in the `using`? That code re-uses the *same* command. :( Also, that code does not follow my code (but perhaps I could have been more specific) -- it opens a *new* SQLConnection each timer but only closes *one*. That is potentially a massive resource leak! Also better-scope the transaction variable name (it should be local to the event callback). –  Jul 21 '12 at 23:50
  • Ok, I have worked on the code a bit, and now it can process around a dozen a second, is there anything else I can do? [Here](http://pastebin.com/S1x84reb) is the code. – CC Inc Jul 22 '12 at 23:00
  • @CCInc Are you updating each one in a timer?? –  Jul 23 '12 at 00:03
  • I am not using a timer here, as it seems to crate to many problems. – CC Inc Jul 23 '12 at 00:08
  • Good. I don't know *what* the original timer was about. Remember, you can update *hundreds* of records a second, but only about 30 transactions/second: make sure the transaction scope is sufficient. –  Jul 23 '12 at 00:11
  • [here](http://pastebin.com/HfytcdDS) is my code. The ChromeSync function runs very slowly, about 5/s. The FirefoxSync can process around 5000 records in a matter of seconds. Why is ChromeSync so slow? – CC Inc Jul 25 '12 at 21:05
  • What's the difference between them? :) –  Jul 25 '12 at 21:49
  • Well, thats what I want to know! I think that they are the same, but one is alot slower. – CC Inc Jul 25 '12 at 22:16
  • @CCInc Cool. What was the issue? –  Jul 26 '12 at 19:51
  • The problem was, one command was alot slower than another. Now, both insert statements can complete about 5000 inserts in 5-6 seconds. – CC Inc Jul 26 '12 at 23:59
2

I'm not sure if this is your problem, but your general pattern of using ADO.NET is wrong - you shouldn't create new command(s) per each insert (and repeatedly pay for query preparation).

Instead, do the following:

  • Before the loop:
    • Create command(s) once.
    • Create appropriate bound parameters.
  • In the loop:
    • Just assign appropriate values to the bound parameters.
    • And execute the command(s).

You could also consider using less fine-grained transactions: try putting several inserts in the same transaction to minimize paying for transaction durability.

You might also want to take a look at this post.

Community
  • 1
  • 1
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
0

You can try one of the following to improve performance :

  • Wrap all the inserts in a transaction - Can help in reducing the actual writes to the DB.
  • Use WAL - The Write-Ahead-Log is a journaling mode that speeds up writes and enables concurrency. (Not recommended if your DB is in a Network location).
  • Synchronous NORMAL - The Synchronous Mode dictates the the frequency at which data is actually flushed to the physical memory (fsync() calls). This can be time taking on some machines and hence the frequency at which this flush occurs is critical. Make sure to explicitly open connections with "Synchronous=NORMAL" ideal for most scenarios. There is a huge difference between Synchronous MODE as FULL and NORMAL (NORMAL is ~1000 times better).

Find more details in a similar post => What changed between System.Data.SQLite version 1.0.74 and the most recent 1.0.113?