0

I'm trying to create a function using Python's Sqlite3 module that will return a list of rows based on a datetime timestamp search using the SELECT command.

Right now, selecting everything (with 'select * from example_table') will return a entire rows correctly, but selecting based on timestamps (using '''select %s from %s where %s > ? and %s < ?''' % (date_col_name, table_name, date_col_name, date_col_name)) will only return the timestamp in a tuple (missing the other column).

Previously, I struggled to preserve the datetime data type, but with detect_types=sqlite3.PARSE_DECLTYPES it returns an actual datetime object. Normally I would suspect this to be the issue, however the actual "filtering" part of the SELECT command is working. The correct datetime objects are being returned, they're just missing the other datas in the row they belonged to.

Relevant code:

sql_database_name = 'data_history.db'
date_col_name = 'Date'

class Database manager ...

    ... init ...

    def get_table_range(self, table_name, daterange=None):
        con = sqlite3.connect(self.database_name, detect_types=sqlite3.PARSE_DECLTYPES)
        c = con.cursor()

        if daterange is not None:
            startdate = daterange[0]
            enddate = daterange[1]
            sql = '''select %s from %s where %s > ? and %s < ?''' % (date_col_name, table_name, date_col_name, date_col_name)
            data = (startdate, enddate)
            c.execute(sql, data)
        else:
            sql = 'select * from %s' % table_name
            c.execute(sql)

        print("Fetchine one: ", c.fetchone())

        result = c.fetchall()

        c.close()
        con.close()

        return result


if __name__ == "__main__":
    test_db = "test.db"
    manager = DatabaseManager(database_name=test_db)

    selected = manager.get_table_range("test_table")
    print("Selected " + str(len(selected)) + "rows.")

    print("---")

    selected = manager.get_table_range("test_table", (datetime(2020, 3, 2, 23), datetime(2020, 3, 3)))
    print("Selected " + str(len(selected)) + "rows.")

Actual output:

Fetchine one:  (datetime.datetime(2020, 3, 2, 19, 12, 57, 120184), 291.0)
Selected 97rows.
Fetchine one:  (datetime.datetime(2020, 3, 2, 23, 22, 15, 704786),) <<-- extra columns were not returned
Selected 25rows.

Desired output:

Fetchine one:  (datetime.datetime(2020, 3, 2, 19, 12, 57, 120184), 291.0)
Selected 97rows.
Fetchine one:  (datetime.datetime(2020, 3, 2, 23, 22, 15, 704786), XXX.X) 
Selected 25rows.
Will
  • 170
  • 10
  • Oh man, you're using string formatting to build queries. This is open to SQL injection – roganjosh Mar 03 '20 at 18:15
  • You're trying to parameterize a query that has no placeholders, because you used them for string formatting (and SQLite uses `?`). I suggest you take a step back here because there are several misconceptions – roganjosh Mar 03 '20 at 18:18
  • [`.format()` method is meant to replace the old `%` formatting syntax. The latter has been de-emphasised, (but not officially deprecated *yet*).](https://stackoverflow.com/a/13452357/1422451) – Parfait Mar 03 '20 at 18:27
  • @roganjosh, yup! I have quite a bit of learning to do, thankfully this won't ever be exposed to an outside network, but I will eventually refactor it for posterity (once I have a working example) – Will Mar 03 '20 at 18:57

1 Answers1

1

It's returning exactly what you're asking for, since your query translates into:

 SELECT date_col_name FROM table_name WHERE date_col_name < ? AND date_col_name > ?;

You probably want to generate the code SELECT * FROM. . . or SELECT list, of, column, names FROM . . ..

And why are you building the SQL statement with string replacement for the column and table names? You should just be writing:

 sql = '''select date_col_name from table_name where date_col_name > ? and date_col_name < ?''' 

The column names are not variables (unless you have some strange set-up with multiple identically structured tables with different name and different column names, which could possibly indicate a design flaw).

Larry Lustig
  • 49,320
  • 14
  • 110
  • 160
  • Your last sentence is exactly what I have set up. I have, within a small network, many small microcontrollers (ESP8266) sending temperature data from various points in a given space. I have a table for each sensor and a datetime + temp column for each table, with each table named after each sensor. These sensors do NOT measure at the same time intervals, so I didn't think I could include them in the same table. Is there a better way that pops out to you? – Will Mar 03 '20 at 19:02
  • I was taught that names/constants shouldn't be "hard coded". In case I need to change them later, I would only have to edit a single line. Is this not the case for SQL? I appreciate the tips! This is exactly why I was asking as well, I'm very new to SQL/databases. – Will Mar 03 '20 at 19:06
  • It would be presumptuous of me to dictate the design of your application when I have so little information about it. That said :-), my strong inclination would be `CREATE TABLE Measurements (SensorID TEXT, MeasureWhen DATETIME, Temperature REAL)`. If the interval per sensor is important to you (for instance, if the same application is actually controlling the sensors) I would have an additional table `CREATE TABLE Sensors (SensorID TEXT, IntervalInSeconds INT)`. I guess I'd have that table anyway, to verify SensorIDs against. – Larry Lustig Mar 03 '20 at 19:20
  • Would this method not greatly slow down lookup times if I wanted to pull a specific region of data to graph every few seconds? Or does indexing (admittedly, this is my next subject to learn for databases) largely resolve this issue? My original motivation for separate tables was that I could cut down on the amount of data to sift through later on when getting a sensors history within a time range. – Will Mar 03 '20 at 19:25
  • Indexing largely resolves this issue. Design the database correctly. If you later experience performance issues (say, when you reach 100 million measurements or so) you can think about "sharding" the data to improve performance -- it's only when that happens that you'll have practical information about where the performance issues are. – Larry Lustig Mar 03 '20 at 19:32