0

I have a sqlite table which I create as follows:

CREATE TABLE data (Year TEXT, County TEXT, Region TEXT, Town TEXT, Page TEXT, Surname TEXT, Forename TEXT, Age TEXT, Sex TEXT, Relation TEXT, Religion TEXT, Birthplace TEXT, Occupation TEXT, Literacy TEXT, Language TEXT, Marital_Status TEXT, Specified_Illnesses TEXT, Years_Married TEXT, Children_Born TEXT, Children_Living TEXT, House_Name TEXT);

The table has no relation to any other tables in my database. There are 9 million rows in the table. The db is 1.5G. There are huge variations in the time it takes to run single queries, for example:

sqlite> .timer ON
sqlite> SELECT DISTINCT Year FROM census_data;
[Two values outputted]
Run Time: real 18.921 user 1.880007 sys 0.932251

I then ran:

sqlite> SELECT DISTINCT County FROM census_data;
[50 values outputted]
Run Time: real 6.444 user 2.437792 sys 0.746738

Then:

sqlite> SELECT DISTINCT Year FROM census_data;
[Two values outputted]
Run Time: real 5.643 user 1.717557 sys 0.758017

Then:

sqlite> SELECT DISTINCT County FROM census_data;
[50 values outputted]
Run Time: real 3.106 user 2.394552 sys 0.689175

I then started writing this post before going back to sqlite3 and this time I get:

sqlite> SELECT DISTINCT County FROM census_data;
[50 values outputted]
Run Time: real 90.853 user 2.691074 sys 1.718229

Is my table the problem or do I need to provide sqlite3 with more memory and if so how do I do this?

UPDATE

I should point out that I don't write to the database now having created it and I ran the following just in case the db was corrupt but the queries are still showing this time issue:

sqlite3 data.sqlite ".dump" | sqlite3 new.sqlite
Baz
  • 12,713
  • 38
  • 145
  • 268
  • From where are you calling SQLite? Could it be that this other process has latency, or maybe that SQLite itself is being called by other processes? – Tim Biegeleisen Dec 06 '18 at 11:19
  • @Tim Biegeleisen I'm running the sqlite3 command from within my terminal on Mac os x. – Baz Dec 06 '18 at 11:27
  • 1
    Do you have the option to create indexes on this table? – TEEKAY Dec 06 '18 at 13:21
  • [Potentially helpful reading](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1) – glibdud Dec 06 '18 at 13:37
  • If you don't have indexes on those columns, every row needs to be scanned, and that can easily take different times depending on factors like how much of the database file is currently being cached by the OS, hard drive speed, what other programs are doing, etc. – Shawn Dec 06 '18 at 15:55
  • 1
    If there's an index on that column, only that index's b-tree needs to be scanned, which is a lot fewer pages to load than the table proper (Thus less disk operations), plus since the values in the index are sorted, it doesn't have to store the already-seen ones in a temporary b-tree and look up every single value in that to see if it's already been encountered (And insert ones that haven't). – Shawn Dec 06 '18 at 20:40

0 Answers0