I have several txt files with entries containing latitude and longitude coordinates as well as some other data. I want to insert these data into separate SQLite tables, differentiated by their latitude and longitude coordinates. In total there are 23 geographical zones that I want to give a table each. However, I am a little unsure how best to do that. My idea so far is to read and insert the data into one database table, then differentiate it into different zones:
cur.execute("CREATE TABLE Total (unixtime int, mmsi int, latitude float, longitude float, port int)")
cur.execute("CREATE INDEX latitude_index ON Total (latitude)")
cur.execute("CREATE INDEX longitude_index ON Total (longitude)")
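If each of the 23 zones gets its own table, the same schema could be created in a loop. A minimal sketch, assuming tables named Zone_0 … Zone_22 (my naming, not from your code):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = con.cursor()
for zone in range(23):
    # One table per geographical zone; the Zone_<n> naming is an assumption
    cur.execute("CREATE TABLE Zone_{n} (unixtime int, mmsi int, latitude float, longitude float, port int)".format(n=zone))
    cur.execute("CREATE INDEX zone_{n}_latitude ON Zone_{n} (latitude)".format(n=zone))
    cur.execute("CREATE INDEX zone_{n}_longitude ON Zone_{n} (longitude)".format(n=zone))
con.commit()
```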
def writeToDatabase(databasepath, tablename, data):
    con = lite.connect(databasepath)
    cur = con.cursor()
    query = """
        INSERT INTO
            {table} (unixtime, mmsi, latitude, longitude, port)
        VALUES (?, ?, ?, ?, ?)
        """.format(table=tablename)
    cur.executemany(query, data)
    con.commit()  # without a commit the inserted rows are never persisted
    con.close()
import os
import sqlite3 as lite

if __name__ == "__main__":
    databasepath = "SAISGlobal.db"
    createDatabaseTable(databasepath)
    for filename in os.listdir('data'):
        newdata = []  # reset per file so earlier files are not re-inserted
        with open("data/" + filename) as f:  # use f to refer to the file object
            for line in f:
                words = line.split(";")
                port = int(words[0])
                mmsi = int(words[1])
                longitude = float(words[2])
                latitude = float(words[3])
                unixtime = int(words[6])
                newdata.append((unixtime, mmsi, latitude, longitude, port))
        writeToDatabase(databasepath, 'Total', newdata)
The idea is then to extract the data from the Total table and insert it into a separate table for each zone using SQLite queries. However, there might be a better way to do this? Each file contains 200-300 MB of data, and there are 300-400 files.
Could I, for example, differentiate the data into zone tables before I insert it into the database? As there are 23 zones, I'm really unsure how I would do this. Does anyone have an idea?
Any other suggestions?
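One way to split before inserting: compute a zone id per row while parsing, collect rows into one list per zone, then call executemany once per zone table. A sketch under assumed names; zone_for is a hypothetical stand-in for the real zone boundaries (here, 23 equal longitude bands), and the Zone_<n> tables are my naming:

```python
import sqlite3
from collections import defaultdict

def zone_for(latitude, longitude):
    # Hypothetical zone rule: 23 equal longitude bands; replace with your real boundaries
    return min(int((longitude + 180) / 360 * 23), 22)

# Two made-up rows in the (unixtime, mmsi, latitude, longitude, port) layout
rows = [
    (1500000000, 123456789, 59.9, 10.7, 1),
    (1500000001, 987654321, 40.7, -74.0, 2),
]

# Bucket rows by zone in memory, one list per zone
buckets = defaultdict(list)
for unixtime, mmsi, lat, lon, port in rows:
    buckets[zone_for(lat, lon)].append((unixtime, mmsi, lat, lon, port))

con = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = con.cursor()
for zone, data in buckets.items():
    cur.execute("CREATE TABLE IF NOT EXISTS Zone_{n} (unixtime int, mmsi int, latitude float, longitude float, port int)".format(n=zone))
    cur.executemany("INSERT INTO Zone_{n} VALUES (?,?,?,?,?)".format(n=zone), data)
con.commit()
```

With this shape you do one executemany per zone per file instead of a second extraction pass over the Total table.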
Edit:
Running time up to the writeToDatabase call is 17.17 seconds for one file, and writeToDatabase itself takes 60-70 seconds per file. If I remove the indexes, writeToDatabase drops to 9-20 seconds; however, I need to write this data to other tables as well, and those tables will need their own indexes.
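Since index maintenance appears to dominate the insert time, a common SQLite pattern is to bulk-load with no indexes present and build them once after all files are inserted. A sketch reusing your Total schema (the 1000 generated rows are placeholders):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = con.cursor()
cur.execute("CREATE TABLE Total (unixtime int, mmsi int, latitude float, longitude float, port int)")

# Bulk-insert with no indexes: SQLite only maintains the table b-tree
rows = [(i, i, 50.0 + i * 0.01, 10.0 + i * 0.01, 1) for i in range(1000)]
cur.executemany("INSERT INTO Total VALUES (?,?,?,?,?)", rows)

# Build the indexes once, after the load, instead of per insert
cur.execute("CREATE INDEX latitude_index ON Total (latitude)")
cur.execute("CREATE INDEX longitude_index ON Total (longitude)")
con.commit()
```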