1

I have thousands of *.db files containing sensor data. All *.db files have the same structure like, ID,MACID,RSSI,ENTRY,EXIT,DEVICEID,UNIQUEID. I have seen a similar question like this one but the solution does not work for me.

What I have tried so far:

  1. Initially tried DB Browser for SQLite to individually convert each *.db to .csv format. It's a lot of copy-paste and I hated it!

  2. Using online forums, created a makeshift script as follows;

import sqlite3
from sqlite3 import Error

import pandas as pd
dataPath = "path_to_a_single_db_file"
# create database connection
conn = sqlite3.connect(dataPath)
# a custom function to create database connection
# Reference: http://www.sqlitetutorial.net/sqlite-python/sqlite-python-select/
def create_connection(db_file):
              try:
                   conn = sqlite3.connect(db_file)
                    return conn
              except Error as e:
                    print(e)

              return None
# create a function for reading & writing the Wifi table

def read_write_wifi_tbl_to_csv(conn):
            cur = conn.cursor()
            query='select * from WIFI'
            data=pd.read_sql(query,conn)
            data.to_csv('export.csv')

def main():
    # create a database connection
    conn = create_connection(dataPath)
    with conn:
      print("1. Query and writing to Wifi table")
      read_write_wifi_tbl_to_csv(conn)

if __name__ == '__main__':
                main()

The above code snippet works well for a single db file. How to make this work for multiple db files.

Spacedman
  • 92,590
  • 12
  • 140
  • 224
mnm
  • 1,962
  • 4
  • 19
  • 46
  • I dont understand why the code is not formatted properly. I'm not new to SO and in past i've edited many posts on code formatting, but today, I can't seem to properly format the code for my own post!!!! What an irony!. – mnm Jan 17 '19 at 07:48
  • I think I've fixed the formatting - you can use triple-back-quotes now to bracket code, so you don't have to put extra spaces in each line. Please check I've not messed anything critical up. – Spacedman Jan 17 '19 at 08:22
  • @Spacedman thanks a lot for your help in this regard. Also on the advice on using triple-back-quotes, greatly appreciate it. – mnm Jan 17 '19 at 08:25

1 Answers1

1

If you use python,

Use sqlite and csv packages to convert each sqlite .db file to csv file.

Then use pandas to read all csv files and write to single csv file using pandas.

EDIT: well, pandas is not required you can directly write to single csv file.

EDIT 2: Here is a quick notebook for that: https://github.com/darshanz/CombineMultipleSqliteToCsv

darshanz
  • 402
  • 3
  • 11
  • I'm not quite good with python. I'm more well-versed with `R`. What i see from online forums [1](http://www.sqlitetutorial.net/sqlite-python/sqlite-python-select/), [2](https://www.pythoncentral.io/introduction-to-sqlite-in-python/) is that I need to establish a database connection first, like, `db = sqlite3.connect('data/mydb')`. There is no database. All i have are individual `*.db` files. So how to solve this issue? – mnm Jan 17 '19 at 06:38
  • you can iterate over the files in a directory in R perhaps using lapply. – darshanz Jan 17 '19 at 07:13
  • Please check my notebook in the my last EDIT – darshanz Jan 17 '19 at 07:40
  • thanks for the inputs and your time. But it still does not answer my question. I've edited the question now. – mnm Jan 17 '19 at 08:04
  • Well, is the problem using multiple files. you can iterate through all the files using os package. I have added new python script you can use it directly. I tested with 2 files in the input folder. – darshanz Jan 17 '19 at 08:33
  • many thanks for your persistence spirit to help solve the issue. Your proposed solution works like a charm. Thanks a lot. – mnm Jan 17 '19 at 08:47