I use a SQL Server CE database for some simulation results. When I test the reading speed the benchmark-times differ greatly.
- dbs are split to 4
.SDF
files (4 quartals) - 302*525000 entries overall
- four
SqlCeConnection
s - they are opened before reading and stay open
- all four databases are located on the same disk (SSD)
- I use
SqlCeDataReader
(IMHO as low-level and fast as you can get) - Reading process is parallel
Simplified code
for (int run = 0; run < 4; run++)
{
InitializeConnections();
for (int reading = 0; reading < 6; reading++)
{
ResetMemoryObjects();
Parallel.For(0, quartals.Count, (i) =>
{
values[i] = ReadFromSqlCeDb(i);
}
}
}
- connections are initialized each run 1 time
- all readings take place in a simple for loop and are exactly the same
- before each reading all the objects are reinitialized
These are the benchmark results I get:
At this point I'm honest - I have no idea why SQL Server CE behaves that way. Maybe someone can give me a hint?
Edit 1: I made a more in depth analysis of each step during the parallel reading. The following chart shows the steps, the "actual reading" is the part inside the while(readerdt.Read()) section.
Edit 2:
After ErikEJ' suggestion I added a TableDirect Approach and made 150 runs, 75 for SELECT and 75 for TableDirect. I summed up the Pre- & Postprocessing of the reading process, because this remains stable and nearly the same for all runs. What differs vastly is the actual reading process.
Every second run was done via TableDirect, so they both start to get drastic better results at around run 65 simultaneously. The range goes from 5.7 second up to 37.4 seconds.
This is the "acutal reading" code. (there are four different databases/sdf files with four different SqlCe connections. Tested on Ryzen7 8-Core CPU)
private static List<List<double>> ReadDataFromDbTableDirect((double von, double bis) timepoints, List<(string compName, string resName)> components, int dbIdx)
{
var values = new List<List<double>>();
for (int j = 0; j < components.Count; j++)
{
values.Add(new List<double>());
}
using (var command = sqlCeCon[dbIdx].CreateCommand())
{
command.CommandType = CommandType.TableDirect;
command.CommandText = table.TableName;
command.IndexName = "PKTIME";
command.SetRange(DbRangeOptions.InclusiveStart | DbRangeOptions.InclusiveEnd, new object[]{timepoints.von},new object[]{timepoints.bis});
using (var reader = (SqlCeDataReader)command.ExecuteReader(CommandBehavior.Default))
{
while (reader.Read())
{
for (int j = 0; j < components.Count; j++)
{
if (!reader.IsDBNull(j))
{
if (j == 0)
{
values[j].Add(reader.GetInt32(j));
}
else
{
values[j].Add(reader.GetDouble(j));
}
}
}
}
}
}
return values;
}
Still no idea why it has such a great delay.
SDF looks like this
Edit 3:
Today I made the same approach with a single database instead of four databases (to exclude problems with Parallel/Tasks). While here TableDirect has a little advantage, the main problem of differing reading speed persists (the sdf data is the same so it is comparable).
Edit 4: These are the results on another machine. Still large outbursts, but a bit more stable. Overall still same issue.
Edit 5: These are the results of a 4x smaller db - 500 runs. TableDirect & Select are (as in previous benchmarks) run alternately, but to better see the results in the graph they are shown in sequence. Notice that the overall time here is not 4-times smaller as you'd expect, but ~8-times smaller. Same problem with the high reading time in the beginning. Next I'll optimize ArrayPooling and stuff...
Edit 6: To further investigate I tried this on 2 different machines. Pc#2 has Win11Home, Ryzen 5, SSD (quite new) and no Antivir, Pc#3 has Win10Pro - pristine installation, everything deactivated (WinDefender), SSD, Ryzen 7. On Pc#3 there is just one peak (first run), on Pc#2 there are several besides the initial requests.
Edit 7 : After ErikEJ's suggestion it might be due to an Index Rebuild, I tried several things.
- A test the reading times after a fresh simulation (db is freshly built in the simulation)
- B test the reading times after copying and loading a db from another folder and apply a Verify(SqlCeEngine) on the db (Verify)
- C test the reading times after copying and loading a db from another folder (no special db treatment)
- D test the reading times after copying and loading a db from another folder, then make a quick first call with one row of all cols (preparation call)
- I also tested SqlCeEngine Repair & Compact. They had nearly the same results as B
- It seems like a verification solves the problem with the initial reading speed. Unfortunately the verification itself takes quite long (>10s on big dbs). Is there a quicker solution for this?
- Result D is a complete surprise to me (mind the different scale). I don't understand what is happening here.. Any guesses?
- Result C shows the long reading times on initial readings, but not always, which is irritating. Maybe it is not the index rebuild which is causing this?
- I still have very huge variations in the reading speed on bigger databases.
I'm currently working on an improved reading process with pointers/memory to reduce GC pressure. I will make the same tests today on another machine.
If anyone has an idea how to improve/stabilize reading speeds please let me know! Thanks in advance!