3

I'm using Flask-SQLAlchemy and I have the following simplified model which I use to track the tasks that I've completed throughout the day. What I'm trying to achieve is to calculate total time spent for each recorded task grouped by task id and client id.

S = TypeVar("S", bound="Tracker")

class Tracker(db.Model):

id = db.Column(Integer)
datetime_start = db.Column(DateTime)
datetime_end = db.Column(DateTime)
task_id = db.Column(Integer, ForeignKey)
client_id = db.Column(Integer, ForeignKey)

The raw SQL query that I have that would give me the result that I want is as such:

SELECT client_id , task_id, sum(difference)
FROM (
    SELECT id, 
        datetime_start, 
        datetime_end, 
        client_id, 
        task_id , 
        datetime_end-datetime_start as difference
    FROM trackers
    WHERE client_id NOTNULL
) AS s
GROUP BY task_id , client_id 
ORDER BY client_id 
Raw Data
"id","datetime_start","datetime_end","client_id","task_id"
1,2022-02-21 12:00:00.00000+0800,2022-02-21 12:30:00.00000+0800,347,3
2,2022-02-21 12:30:00.00000+0800,2022-02-21 12:50:00.00000+0800,271,4
3,2022-02-21 13:00:00.00000+0800,2022-02-21 13:20:00.00000+0800,34,1
4,2022-02-21 13:20:00.00000+0800,2022-02-21 13:30:00.00000+0800,347,1
7,2022-02-21 14:50:00.00000+0800,2022-02-21 15:40:00.00000+0800,271,4
8,2022-02-21 15:45:00.00000+0800,2022-02-21 16:45:00.00000+0800,271,6
9,2022-02-21 18:00:00.00000+0800,2022-02-21 19:30:00.00000+0800,29,3
Post-Calculation Data
"client_id","task_id","sum"
29,3,"01:30:00"
34,1,"00:20:00"
271,4,"01:10:00"
271,6,"01:00:00"
347,3,"00:30:00"
347,1,"00:10:00"

My current class method query code is as follows: (Inspired from: https://stackoverflow.com/a/33509857 & https://stackoverflow.com/a/27423865) but it doesn't work (Errors out with BaseQuery is not callable - This leads me down another rabbit hole) which I suspect it to be because I'm using model based query but the example shown in the link is declarative based.)

@classmethod
def custom_report(cls: Type[S], filters: Union[list, BooleanClauseList]):
        return cls.query((cls.datetime_end - cls.datetime_start).label("difference"))
            .filter(*filters)
            .group_by(cls.task_id, cls.client_id)
            .order_by(cls.client_id)

I understand my custom_report query is incomplete because it still doesn't address the aggregate SUM function but before I can get there, I'm already stuck where I'm currently at. I do however plan to use subquery() as pointed out here to accomplish the remaining portion: https://stackoverflow.com/a/38880249

Update 1: Based on the responses from @snakecharmerb and @CAMILO JOSÉ CRUZ RIVERA, here's the revised query that worked:

@classmethod
def custom_report(cls: Type[S], filters: Union[list, BooleanClauseList]):
    return (
        db.session.query(
            cls.task_id,
            cls.client_id,
            func.sum((cls.datetime_end - cls.datetime_start).label("difference")),
        )
        .filter(*filters)
        .group_by(cls.task_id, cls.client_id)
        .order_by(cls.client_id)
    )
jtsnooping
  • 35
  • 1
  • 4
  • 1
    Try using `db.session.query(cls.task_id, cls.client_id, (cls.datetime_end - cls.datetime_start).label("difference"))`. – snakecharmerb Feb 21 '22 at 10:10
  • @snakecharmerb, Thank you for the response. I did some further digging and found out why your suggestion worked. From this post: https://stackoverflow.com/a/40919686/7975045, my existing code was performing a model query when in fact I shouldn't be doing since I require selected fields and also my query involves an additional column which isn't part of the model. Also to answer your question, I'm using PostgreSQL – jtsnooping Feb 22 '22 at 16:07

1 Answers1

2

You are trying to group from a column that you aren't querying.

Try including this fields on the query

@classmethod
def custom_report(cls: Type[S], filters: Union[list, BooleanClauseList]):
        return cls.query(cls.task_id, cls.client_id, (cls.datetime_end - cls.datetime_start).label("difference"))
            .filters(*filters)
            .group_by(cls.task_id, cls.client_id)
            .order_by(cls.client_id)

  • 1
    Thank you, what should have been obvious to me from the outright was that my raw SQL query had the required fields in my SELECT statement but wasn't present in my code. I've revised my original question with the working code. – jtsnooping Feb 22 '22 at 16:12