First let me begin by saying that I'm somewhat new to sql (but have been doing python for a long while). I've been having trouble finding a good solution to my problem on the net.
The problem:
I have an undefined number (although probably less than 100) of identically structured sqlite databases that I need to query and merge results from. The databases themselves are not particularly huge.
I've been looking at the ATTACH command and following things in this tutorial: http://souptonuts.sourceforge.net/readme_sqlite_tutorial.html
import sqlite3 as lite
Database = "library.db"
con = lite.connect(Database)
con.row_factory = lite.Row
cur = con.cursor()
cur.execute("ATTACH DATABASE 'library.db' As 'db1'")
cur.execute("ATTACH DATABASE 'library2.db' As 'db2'")
cur.execute("""
SELECT 'db1',* FROM db1.table
UNION
SELECT 'db2',* FROM db2.table
""")
but it seems like there should be a better way than to explicitly spell out each database in the execute command. Also, it looks like there's a limit to the number of databases that I can attach to? https://sqlite.org/limits.html
I've also looked at something like merging them together into a large database: How can I merge many SQLite databases? but it seems inefficient merge the databases together each time that a query needs to be made or one of the many individual databases gets changed.
Before I continue down some path, I was wondering if there were any better options for this type of thing that I'm not aware of?
Other potentially useful information about the project:
- There are two main tables.
- The tables from db to db can have duplicates.
- I need to be able to grab unique values in columns for the "merged" databases.
- Knowing which database what data came from is not essential.
- The individual databases are updated frequently.
- Queries to the "merged" database are made frequently.