I would like to retrieve a random set of X entries from my Postgres database using SQLAlchemy. My first approach was this:
random_set_of_Xrows = models.Table.query.filter(something).order_by(func.random()).limit(len(X)).all()
Since my table is quite big, this command takes about one second, and I was wondering how to optimise it. I guess order_by(func.random()) has to look at (and sort) every matching row, so I figured using offset instead might be faster. However, I can't quite see how to avoid the row count entirely. Here is an approach using offset:
rowCount = db.session.query(func.count(models.Table.id)).filter(something).scalar()
random_set_of_Xrows = models.Table.query.filter(something).offset(func.floor(func.random()*rowCount)).limit(len(X)).all()
which, however, is not faster, with most of the time spent getting rowCount. Any ideas on how to make this faster? Cheers, Carl
EDIT: As suggested below, I added a column with a random value to the table and used it to select the rows like this:
random_set_of_Xrows = models.Table.query.filter(something).order_by(models.Table.random_value).limit(len(X)).all()
I ignored the offset part, since it doesn't matter to me if two calls return the same results; I just need a random set of rows.
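For reference, a minimal sketch of what such a column can look like in a Flask-SQLAlchemy model; the Float type, the Python-side default, the index and the my_table name are illustrative, not my exact code:

import random
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

class Table(db.Model):
    __tablename__ = "my_table"

    id = db.Column(db.Integer, primary_key=True)
    # New rows get a random value on insert; an index on the column can let
    # ORDER BY random_value LIMIT n avoid sorting the whole (filtered) set,
    # depending on how selective the filter is.
    random_value = db.Column(db.Float, default=random.random, index=True)

Existing rows can be backfilled once with plain SQL, e.g. UPDATE my_table SET random_value = random(); in Postgres.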