4

I have a database of people names and their birthdays. The format of birthday is mm/dd/yyyy, like "3/13/1960".

I want to extract a list of people who are born after a specific date. I called this date "base".

The program that you see below, firstly creates a DB of people (to simulate the real DB that I want to work with), and then extracts the required list. The problem is that the result is not as I expect:

import datetime as dt
import peewee as pw
db = pw.SqliteDatabase('people1.db')

class Person(pw.Model):
    name = pw.CharField()
    birthday = pw.DateField(formats=['%m/%d/%Y'])
    class Meta:
        database = db # This model uses the "people.db" database.

db.create_tables([Person])


bob0 = Person(name='Bob0', birthday='4/13/1940')
bob1 = Person(name='Bob1', birthday='5/13/1950')
bob2 = Person(name='Bob2', birthday='3/13/1960')
bob3 = Person(name='Bob3', birthday='3/13/1970')
bob4 = Person(name='Bob4', birthday='3/13/1980')
bob5 = Person(name='Bob5', birthday='3/13/1990')

base = Person(name="base", birthday='3/13/1960')

bob0.save()
bob1.save()
bob2.save()
bob3.save()
bob4.save()
bob5.save()
base.save()

for item in Person.select().where(Person.birthday > base.birthday):
    print item.name , item.birthday

Output:

>>> ================================ RESTART ================================
>>> 
Bob0 1940-04-13
Bob1 1950-05-13
Bob3 1970-03-13
Bob4 1980-03-13
Bob5 1990-03-13
>>> 

As you see above, base = 3/13/1960. So I shouldn't have Bob0 and Bob1 in output! How can I handle it?

Note that, I don't want to change the format of birthdays in the database. I also don't want to fetch all the rows and check them later! I want to fetch the required rows only.

Mehdi Jafarnia Jahromi
  • 2,017
  • 3
  • 15
  • 14
Ebrahim Ghasemi
  • 5,850
  • 10
  • 52
  • 113
  • They do warn in the docs that "to ensure that comparisons work correctly, the dates need to be formatted so they are sorted lexicographically". Have you looked into peewee's custom fields and operators? – Paulo Almeida Sep 03 '15 at 15:15
  • @PauloAlmeida Ah, does that mean the date must be save in the format of `yyyy/mm/dd`? If so, what now? What shall I do now? Actually I checked the documentation, but I didn't saw this expression. – Ebrahim Ghasemi Sep 03 '15 at 15:39
  • Well, you don't _have_ to use any format, but of course it would make your life a lot easier if you could use the default. Other than that, as far as I can see, you will have to use custom fields and/or user-defined operators. They have an example of `mod` for the latter, in the documentation. It's not immediately obvious to me how you would do it, because it's more than just an operation, but I suppose it would be possible. – Paulo Almeida Sep 03 '15 at 15:48
  • http://stackoverflow.com/questions/17507477/comparison-dates-with-string-dd-mm-yyyy-format-in-sqlite – Padraic Cunningham Sep 05 '15 at 20:41
  • No offense but it's common sense that if you're storing your dates as strings and expect to sort them, they need to be ordered lexicographically. Just update your database and save yourself a lot of trouble. – coleifer Sep 24 '15 at 03:13
  • @coleifer database is not mine actually. I have a read access only. :) – Ebrahim Ghasemi Sep 24 '15 at 21:05

4 Answers4

3

SQlite stores date-times as strings. So as other have suggested in comments and other answers you should use a different format for storing dates, so that "date ordering and lexical ordering work out the same":

import datetime as dt
import peewee as pw

db = pw.SqliteDatabase('people1.db')

class Person(pw.Model):
    name = pw.CharField()
    birthday = pw.DateField(formats=['%Y-%m-%d'])
    class Meta:
        database = db # This model uses the "people.db" database.

db.create_tables([Person])


Person.create(name='Bob0', birthday=dt.date(1940, 4, 13))
Person.create(name='Bob1', birthday=dt.date(1950, 5, 13))
Person.create(name='Bob2', birthday=dt.date(1960, 3, 13))
Person.create(name='Bob3', birthday=dt.date(1970, 3, 13))
Person.create(name='Bob4', birthday=dt.date(1980, 3, 13))
Person.create(name='Bob5', birthday=dt.date(1990, 3, 13))

base = Person.create(name="base", birthday=dt.date(1960, 3, 13))

for item in Person.select().where(Person.birthday > base.birthday):
    print item.name , item.birthday

This gives:

Bob3 1970-03-13
Bob4 1980-03-13
Bob5 1990-03-13

UPDATE

I haven't noticed your comment that you don't want to change the database.

Here is a crazy way to extract parts of the date:

SELECT
    birthday,
    CAST(substr(birthday, 1, instr(birthday, '/') - 1) AS integer),
    CAST(substr(substr(birthday, instr(birthday, '/') + 1), 1, instr(substr(birthday, instr(birthday, '/') + 1), '/') - 1) AS integer),
    CAST(substr(birthday, instr(birthday, '/') + instr(substr(birthday, instr(birthday, '/') + 1), '/') + 1) AS integer)
FROM person

which on my test data gives:

4/13/1940   4   13  1940
12/13/1950  12  13  1950
3/3/1960    3   3   1960
3/25/1970   3   25  1970
3/13/1980   3   13  1980
3/13/1990   3   13  1990
3/13/1960   3   13  1960

You can use these expressions to compare them with parts of the given date:

query = """
SELECT *
FROM person
WHERE
    (
        substr('0000' || CAST(substr(birthday, instr(birthday, '/') + instr(substr(birthday, instr(birthday, '/') + 1), '/') + 1) AS integer), -4, 4) || '-' || -- year
        substr('00' || CAST(substr(birthday, 1, instr(birthday, '/') - 1) AS integer), -2, 2) || '-' || -- month
        substr('00' || CAST(substr(substr(birthday, instr(birthday, '/') + 1), 1, instr(substr(birthday, instr(birthday, '/') + 1), '/') - 1) AS integer), -2, 2) -- day
    ) > '1960-03-03'
"""
for item in Person.raw(query):
    print item.name, item.birthday

I am reconstructing ISO date here and use it for comparison.

Community
  • 1
  • 1
warvariuc
  • 57,116
  • 41
  • 173
  • 227
  • Actually I knew this solution already. As I said in the question, I don't want to change the DB. It's a big online file that changing continuously and I prefer to don't change it. – Ebrahim Ghasemi Sep 05 '15 at 18:46
  • 1
    http://stackoverflow.com/questions/1975737/sqlite-datetime-comparison Answerer is right. @Abraham The only solution I see is using some string manipulation functions (like `substr`) in condition to compare your dates, if you won't change the format of date. – dmitry Sep 05 '15 at 18:57
  • @dmitry I tried some of them, but I failed. I received some weird exceptions. May I ask you to please provide an example? :) – Ebrahim Ghasemi Sep 05 '15 at 19:12
  • 1
    Peewee has `fn` for defining SQL function calls in expressions. It would look something like `fn.substr(Person.birthday, 5, 4) > '1960'`. You could create some expression using substr. Though there's still a problem in initial date format: variable length of day and month parts. – dmitry Sep 05 '15 at 19:26
  • 1
    The only sane way to use *database* for your task is to change format to supported. – dmitry Sep 05 '15 at 19:33
  • Thank you again dear @warvariuc, but again, as I said in my question notes, I don't want to fetch all the rows and check them later. Using your updated solution, I will receive all the database data and extract the year,month, day from their birthday and then I must check them for that condition (being greater than `base` birthday.). But I don't want to do that, I want to receive exactly what I want from the database. You know? Using your solution I must download all the data from the online database and check it locally,but I want to download just the required data (check date on the server) – Ebrahim Ghasemi Sep 06 '15 at 06:04
  • @warvariuc Ah, may I ask you to modify my program in the way that you mean? where I must add this expressions? `for item in Person.select().where( -- > here? <-- )`? Thanks a lot. – Ebrahim Ghasemi Sep 06 '15 at 07:50
  • @Abraham, see update. I hope you will manage to substitute the reference date. See also [this](http://peewee.readthedocs.org/en/latest/peewee/querying.html#security-and-sql-injection) – warvariuc Sep 06 '15 at 12:16
  • Dear warvariuc, your answer also was great and worked fine for me but for some reasons I prefer to mark @Jeremy's post as my accepted answer. I wanna thank you again my friend. <3 – Ebrahim Ghasemi Sep 08 '15 at 04:59
2

You can use sqlite3.Connection.create_function to specify your own sqlite function that will transform your dates into something that can be lexicographically sorted:

import datetime as dt
import peewee as pw

# custom sqlite function to reformat our date string
def _sqlite_reformat_date(unfortunate_date_string):
    return dt.datetime \
        .strptime(unfortunate_date_string,'%m/%d/%Y') \
        .strftime('%Y-%m-%d')

# Subclass pw.SqliteDatabase to add our custom sqlite function
class MySqliteDatabase(pw.SqliteDatabase):
    def __init__(self, *args, **kwargs):
        super(MySqliteDatabase, self).__init__(*args, **kwargs)

    def _add_conn_hooks(self, conn):
        conn.create_function('reformat_date', 1, _sqlite_reformat_date)
        super(MySqliteDatabase, self)._add_conn_hooks(conn)

db = MySqliteDatabase('people1.db')

# ...
# Your model definition and data inserts from your example above
# ...

rd = pw.fn.reformat_date # Use our custom sqlite function
for item in Person.select().where(rd(Person.birthday) > rd(base.birthday)):
    print item.name , item.birthday

Although this approach will 'fetch the required rows only' it will still run this python function for every row! It is little better than just fetching all the rows in doing the date comparison in python, it may even be slower!

However the _sqlite_reformat_date function could be refactored to be much faster and its nice to know how easy it is to add a custom function into sqlite.

Jeremy Allen
  • 6,434
  • 2
  • 26
  • 31
  • Seems great. I'll check it. – Ebrahim Ghasemi Sep 06 '15 at 07:52
  • @Jeremy, is this feature of 'adding your own function' available on MySQL database as well? – Mehdi Jafarnia Jahromi Sep 06 '15 at 15:15
  • @MehdiJafarniaJahromi you can add new sql functions in a number of different ways: see [Adding New Functions to MySQL](https://dev.mysql.com/doc/refman/5.7/en/adding-functions.html). However I can't immediately see an approach similar to this 'sql function calls a python function' approach – Jeremy Allen Sep 06 '15 at 18:33
  • This is going to be extremely inefficient. – coleifer Sep 06 '15 at 23:55
  • @MehdiJafarniaJahromi As far as I know MySQL supports datetime formats and you won't need to define any function for that. Am I right dear Jeremy? – Ebrahim Ghasemi Sep 07 '15 at 01:58
  • With both approaches (using a user space function and reconstructing the date) you will not be able to use indexes, so filtering/ordering by date will be very slow. – warvariuc Sep 07 '15 at 07:02
0

Just some chaining of SQL functions looks funny but works, maybe faster.

from datetime import datetime

import peewee as pw

db = pw.SqliteDatabase('people1.db')

class Person(pw.Model):
    name = pw.CharField()
    birthday = pw.DateField(formats=['%m/%d/%Y'])
    class Meta:
        database = db # This model uses the "people.db" database.

db.create_tables([Person])


bob0 = Person(name='Bob0', birthday='4/13/1940')
bob1 = Person(name='Bob1', birthday='5/13/1950')
bob2 = Person(name='Bob2', birthday='3/13/1960')
bob3 = Person(name='Bob3', birthday='3/13/1970')
bob4 = Person(name='Bob4', birthday='3/13/1980')
bob5 = Person(name='Bob5', birthday='3/13/1990')
bob6 = Person(name='Bob6', birthday='12/1/1990')

base = Person(name="base", birthday='3/13/1960')

bob0.save()
bob1.save()
bob2.save()
bob3.save()
bob4.save()
bob5.save()
bob6.save()
base.save()

month = 'substr(birthday,1,instr(birthday,"/")-1)'
iso_month = 'case when length({month}) = 1 then "0" || {month} else {month} end'.format(month=month)
day = 'trim(trim(birthday,"0123456789"),"/")'
iso_day = 'case when length({day}) = 1 then "0" || {day} else {day} end'.format(day=day)
year = 'substr(ltrim(ltrim(birthday,"0123456789"),"/"),instr(ltrim(ltrim(birthday,"0123456789"),"/"),"/")+1)'
iso_date = 'replace(replace(replace("yyyy-mm-dd","yyyy",{year}),"mm",{iso_month}),"dd",{iso_day})'.format(year=year,iso_month=iso_month,iso_day=iso_day)
iso_base = datetime.strptime(base.birthday,'%m/%d/%Y').date().isoformat()

if __name__ == '__main__':

    for item in Person.select().where(pw.SQL(iso_date) > iso_base):
        print item.name , item.birthday
#output
#Bob3 1970-03-13
#Bob4 1980-03-13
#Bob5 1990-03-13
#Bob6 1990-12-01
Nizam Mohamed
  • 8,751
  • 24
  • 32
-1

SQLite stores dates as strings. For this reason they should be stored as YYYY-MM-DD, this ensures they will be sorted correctly. There is no reason to do otherwise, honestly.

If you look at sqlite's docs, it does not even recognize dates in the format you're using:

https://www.sqlite.org/lang_datefunc.html

So, my advice is to update the way you're storing dates.

Otherwise, create a user-defined function that does the correct thing using strptime (assumes the use of playhouse.sqlite_ext.SqliteExtDatabase):

@db.func()
def fix_time(s):
    return datetime.datetime.strptime(s, '%m/%d/%Y').strftime('%Y-%m-%d')

If you want to stick with regular old SqliteDatabase, you can call the sqlite3 method connection.create_function.

ChrisF
  • 134,786
  • 31
  • 255
  • 325
coleifer
  • 24,887
  • 6
  • 60
  • 75