-1

I'm currently stuck on how to create a statement that shows daily overdraft statements for a particular council.

I have the following, councils, users, markets, market_transactions, user_deposits.

market_transaction run daily reducing user's account balance. When the account_balance is 0 the users go into overdraft (negative). When users make a deposit their account balance increases.

I Have put the following tables to show how transactions and deposits are stored. if I reverse today's transactions I'm able to get what account balance a user had yesterday but to formulate a query to get the daily OD amount is where the problem is.

USERS

user_id name account_bal
1 Wells -5
2 James 100
3 Joy 10
4 Mumbi -300

DEPOSITS

id user_id amount date
1 1 5 2021-04-26
2 3 10 2021-04-26
3 3 5 2021-04-25
4 4 5 2021-04-25

TRANSACTIONS

id user_id amount_tendered date
1 1 5 2021-04-27
2 2 10 2021-04-26
3 3 15 2021-04-26
4 4 50 2021-04-25

The Relationships are as follows,

COUNCILS

council_id name
1 a
2 b
3 c

MARKETS

market_id name council_id
1 x 3
2 y 1
3 z 2

MARTKET_USER_LINK

id market_id user_id
1 1 3
2 2 2
3 3 1

I'm running this SQL query to get the total amount users have spent and subtracting with the current user account balance. Don't know If I can use this to figure out the account_balance for each day.

SELECT u.user_id, total_spent, total_deposits,m.council_id
FROM users u
JOIN market_user_link ul ON ul.user_id= u.user_id
LEFT JOIN markets m ON ul.market_id =m.market_id
LEFT JOIN councils c ON m.council_id =c.council_id
LEFT JOIN (
SELECT user_id, SUM(amount_tendered) AS total_spent
FROM transactions
WHERE DATE(date) BETWEEN DATE('2021-02-01') AND DATE(NOW())
GROUP BY user_id 
) t ON t.user_id= u.user_id
ORDER BY user_id, total_spent ASC

// looks like this when run

|   user_id   |  total_spent   | council_id |
|-------------|----------------|------------|
|     1       |      50.00     |     1      | 
|     2       |       2.00     |     3      | 

I was hoping to reverse transactions and deposits done to get the account balance for a day then get the sum of users with an account balance < 0... But this has just failed to work.

The goal is to produce a query that shows daily overdraft (Only SUM the total account balance of users with account balance below 0 ) for a particular council.

Expected Result

date council_id o_d_amount
2021-04-24 1 -300.00
2021-04-24 2 -60.00
2021-04-24 3 -900.00
2021-04-25 1 -600.00
2021-04-25 2 -100.00
2021-04-25 3 -1200.00
Wells
  • 142
  • 1
  • 13
  • 1
    MySQL != SQL Server - please correct your tags. – Dale K Apr 28 '21 at 22:03
  • 2
    @wells your output doesn't match given sample data , your sample data are for April 24 and later , while your output shows April 1st and 2nd , not usefull – eshirvana May 01 '21 at 15:47
  • @eshirvana, alright I was hoping it will help illustrate the end goal but I've updated them to suit the dates – Wells May 03 '21 at 09:28
  • Your MARKET_USER_LINK.market_id column appears to be wrong; it should contain integers. You should be using a site such as db-fiddle.com to define and initialize these table with the correct data. See [Why should I provide a Minimal Reproducible Example for a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query). – Booboo May 03 '21 at 12:00
  • @Booboo It was a mistaket, yes market_id is an integer – Wells May 03 '21 at 12:18
  • 1
    @Wells for overdraft amount shown in the output , I can't see how that is calculated using given sample data. explain one with given sample data in english – eshirvana May 03 '21 at 14:14

2 Answers2

1

This is actually not that hard, but the way you asked makes it hard to follow.

Also, your expected result should match the data you provided.

Edited: Previous solution was wrong - It counted withdraws and deposits more than once if you have more than one event for each user/date.

Start by having the total exchanged on each day, like

select user_id, date, sum(amount) exchanged_on_day from (
            select user_id, date, amount amount from deposits 
  union all select user_id, date, -amount_tendered amount from transactions
) d 
group by user_id, date
order by user_id, date;

What follows gets the state of the account only on days that had any deposits or withdraws. To get the results of all days (and not just those with account movement) you just have to change the cross join part to get a table with all dates you want (like Get all dates between two dates in SQL Server) but I digress...

select dates.date, c.council_id, u.name username
  , u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
  , u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left  join (
  select user_id, date, sum(amount) exchanged_on_day from (
              select user_id, date, amount amount from deposits 
    union all select user_id, date, -amount_tendered amount from transactions
  ) d group by user_id, date
) e on u.user_id=e.user_id --exchange on each Day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates --all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
order by dates.date desc, c.council_id, u.name;

From there you can rearrange to get the result you want.

select date, council_id
, sum(case when amount_on_start_of_day<0 then amount_on_start_of_day else 0 end) o_d_amount_start
, sum(case when amount_on_end_of_day<0 then amount_on_end_of_day else 0 end) o_d_amount_end
from (
  select dates.date, c.council_id, u.name username
  , u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
  , u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
  from councils c
  inner join markets m on c.council_id=m.council_id
  inner join market_user_link mul on m.market_id=mul.market_id
  inner join users u on mul.user_id=u.user_id
  left  join (
    select user_id, date, sum(amount) exchanged_on_day from (
                select user_id, date, amount amount from deposits 
      union all select user_id, date, -amount_tendered amount from transactions
    ) d group by user_id, date
  ) e on u.user_id=e.user_id --exchange on each Day
  cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates --all days that had a transaction
  group by dates.date, c.council_id, u.name, u.account_bal
) result
group by date, council_id
order by date;

You can check it on https://www.db-fiddle.com/f/msScT6B5F7FjU2aQXVr2da/6

Bruno Canettieri
  • 501
  • 2
  • 11
  • I've added more sample data, you may wish to test your solution with new data. – Serg May 07 '21 at 07:33
  • @Serg, Thanks. Difference is that you show the overdraft on the beginning of each day. A simple change in my query gets the same results as yours (using ">" as explained) – Bruno Canettieri May 08 '21 at 19:15
  • Correction, using ">=" – Bruno Canettieri May 08 '21 at 19:27
  • Do you mean `sum(case when e.date >= dates.date`? No, dating is still different. My query lists overdraft at the end of a day. For example the council 3 consists of a single user, 3. so when user 3 spends 17 at 2021-04-27, my query lists for council 3: `2021-04-26 -46`; `2021-04-27 -63`. And when the user deposits 73 at 2021-05-06 my query shows overdraft is 0 for that day. – Serg May 08 '21 at 21:13
  • You are right that your query lists the result at the end of day, my mistake. But our result are the same. I've updated my answer so that db-fiddle uses the same test data you provided. Now the difference is only because you add an extra date (2021-04-01). Or am I not seeing something obvious? – Bruno Canettieri May 09 '21 at 00:58
0

Basically the query maps users to councils, caculates periods of overdrafts for users, them aggregates over councils. I assume that starting balance is dated start of the month '2021-04-01' (it could be ending balance as well, see below), change it as needed. Also that negative starting balance counts as an overdraft. For simplicity and debugging the query is divided into a number of steps.

  with uc as (
   select distinct m.council_id, mul.user_id
   from markets m 
   join market_user_link mul on m.market_id = mul.market_id
  ),
  user_running_total as (
    select user_id, date, 
     coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
     sum(sum(s)) over(partition by user_id order by date) rt
    from (
       select user_id, date, -amount_tendered s
       from transactions 
       union all
       select user_id, date, amount
       from deposits
       union all
       select user_id, se.d, se.s
       from users
       cross join lateral (
            select date(NOW() + interval 1 day) d, 0 s
            union all
            select '2021-04-01' d, account_bal
        ) se
    ) t
    group by user_id, date
  ),
  user_overdraft as (
     select user_id, date, nxt, least(rt, 0) ovd
     from user_running_total 
     where date <= date(NOW()) 
  ),
  dates as (
     select date
     from user_overdraft
     union 
     select nxt
     from user_overdraft
  ),
  council__overdraft as ( 
     select uc.council_id, d.date, sum(uo.ovd) total_overdraft, lag(sum(uo.ovd), 1, sum(uo.ovd) - 1) over(partition by uc.council_id order by d.date) prev_ovd
     from uc
     cross join dates d
     join user_overdraft uo on uc.user_id = uo.user_id and d.date between uo.date and uo.nxt
     group by uc.council_id, d.date
  )
  select council_id, date, total_overdraft
  from council__overdraft
  where total_overdraft <> prev_ovd
  order by date, council_id  

Really council__overdraft is quite usable, the last step just compacts output excluding intermidiate dates when overdraft is not changed. With following sample data:

users

user_id name    account_bal
1   Wells   -5
2   James   100
3   Joy 10
4   Mumbi   -300

deposits, odered by date, extra row added for the last date

id  user_id amount  date
3   3   5   2021-04-25
4   4   5   2021-04-25
1   1   5   2021-04-26
2   3   10  2021-04-26
5   3   73  2021-05-06

transactions, odered by date (note the added row, to illustrate running total in action)

id  user_id amount_tendered date
5   4   50  2021-04-25
2   2   10  2021-04-26
3   3   15  2021-04-26
1   1   5   2021-04-27
4   3   17  2021-04-27

councils

council_id  name
1   a
2   b
3   c

markets

market_id   name    council_id
1   x   3
2   y   1
3   z   2

market_user_link

id  market_id   user_id
1   1   3
2   2   2
3   3   1
4   3   4

the query ouput is

council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -305
3 2021-04-01 0
2 2021-04-25 -350
2 2021-04-26 -345
2 2021-04-27 -350
3 2021-04-27 -7
3 2021-05-06 0

Alternatively, provided the users table is holding a closing (NOW()) balance, replace user_running_total CTE with the following code

  user_running_total as (
    select user_id, date, 
       coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
       coalesce(sum(sum(s)) over(partition by user_id order by date desc
            rows between unbounded preceding and 1 preceding), sum(s))  rt
    from (
       select user_id, date, amount_tendered s
       from transactions 
       union all
       select user_id, date, -amount
       from deposits
       union all
       select user_id, se.d, se.s
       from users
       cross join lateral (
            select date(NOW() + interval 1 day) d, account_bal s
            union all
            select '2021-04-01' d, 0
        ) se
    ) t
    where DATE(date) between date '2021-04-01' and date(NOW() + interval 1 day)
    group by user_id, date
  ),

This way the query starts with closing balance dated next date after now and rollouts a running total in the reverse order till '2021-04-01' as a starting date.

Output

council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -260
3 2021-04-01 -46
2 2021-04-25 -305
3 2021-04-25 -41
2 2021-04-26 -300
3 2021-04-26 -46
2 2021-04-27 -305
3 2021-04-27 -63
3 2021-05-06 0

db-fiddle both versions

Serg
  • 22,285
  • 5
  • 21
  • 48
  • Hey @Serg, Thank you for the extra test cases, caught an error on my solution. I think you got it wrong because the question states that "if I reverse today's transactions I'm able to get what account balance a user had yesterday" - meaning the starting balance is dated after the last date of the events tables. – Bruno Canettieri May 06 '21 at 00:20
  • @BrunoCanettieri, .. added the query version for the case when users table is holding a closing, NOW() balance. Thanks for the point. – Serg May 06 '21 at 07:39
  • Hey @Serg, if the overdraft is -305 on the last day, it can't possibly be -5 on the previous day (There is no operation large enough to make that happen). The problem is that in your inner query there is no row with the rt (balance) for an user/day when there is no operation - then your final query considers it as being zero. – Bruno Canettieri May 06 '21 at 13:41
  • @BrunoCanettieri,.. thank you, that's right. Added the dates step , more test data. – Serg May 06 '21 at 19:10