I'm writing a program that uses a sqlite3 database file to store its data. If I open a database file with
sqlite3_open_v2(filename, &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, NULL)
the database file is created if it does not exist. How can I find out if the database file existed prior to opening it? The sqlite3 shell uses code like this:
/* Go ahead and open the database file if it already exists. If the
** file does not exist, delay opening it. This prevents empty database
** files from being created if a user mistypes the database name argument
** to the sqlite command-line tool.
*/
if( access(data.zDbFilename, 0)==0 ){
open_db(&data, 0);
}
This code however has a race condition when the database file is created by another process after the access
call and before the open_db
call (time-of-check vs. time-of-use).
Another answer (that I can't find right now) suggests to check the application_id
and user_version
fields. If they are zero, a database was just created. I researched this approach and found that numerous applications actually don't bother to set these fields in newly created databases, so this approach is fuzzy at best and I don't think it solves my problem.
Is there a method to find out if the database existed prior to opening it that does not involve a race condition like this? It is also acceptable if I can merely find out if the database file was initialized (as in, a truncated file was populated with a sqlite3 header) by sqlite3.
The purpose of having such a routine is to be able to find out if I need to create all the tables I need in the database. I don't want to accidentally overwrite another file placed there by a different application.
Sample code
A simplified illustration of the problem can be found in the following bash script. It simulates two applications. They both work in similar ways:
- Create or open a database
test.db
- If the database did not exist before, create a table
test
and write a single row into it. The value is 1 for the first application and 2 for the second. - Print the contents of the database.
Here is the code:
#!/bin/sh
# sleeps are inserted to make the race conditions easier to trigger
application() {
echo Checking if database exists...
[ ! -f test.db ]
status=$?
sleep 2
if (exit $status)
then
echo Database not found, making tables
sqlite3 test.db "CREATE TABLE IF NOT EXISTS test (a);"
sleep 2
echo Writing initial record into database
sqlite3 test.db "INSERT INTO test VALUES ($1);"
sleep 2
else
echo Database found, checking if it belongs to me
fi
echo Printing content of database
sqlite3 test.db "SELECT * FROM test;"
}
rm -f test.db
echo First test: app1 and app1 race
application 1 & (sleep 1 ; application 1)
rm -f test.db
echo Second test: app2 and app1 race
application 2 & (sleep 1 ; application 1)
My goal is to ensure that the following cases can never happen:
- An instance of application one opens the database file, concludes that it isn't initialized and initializes it (by creating the table and writing the initial record) even though it already contains data from a different instance of the same application or from a different application.
- An database file belonging to a different application is written into.
If application
was programmed correctly, each run would only initialize the database if it wasn't initialized before. Thus, the only output you would see is either a database containing only the row 1
or a database that contains only the row 2
.