Suppose you have a table like this:
table users:
- userid (PK, id of the user)
- regdate (datetime, date of registration)
- idprovince (external id of address province)
If I write a query as such:
SELECT DATE_FORMAT(regdate,"%Y-%m-01") as regmonth, idprovince , count(userid) as num FROM users GROUP BY DATE_FORMAT(regdate,"%Y-%m-01"), idprovince
This will correctly produce a grouped result which will show how many NEW users have registered in any given month and province.
Suppose now I want the cumulative number of users in each province in any given month (the value for any given month and province should be the sum of the new users of that month and province AND all the previous months of that very same province), how should I build the query to be efficient?
I've tried using a subquery as such:
SELECT
DATE_FORMAT(regdate,"%Y-%m-01") as regmonth,
idprovince ,
(SELECT
COUNT(userid)
FROM
users AS counting_0_tbl
WHERE DATE_FORMAT(counting_0_tbl.regdate,"%Y-%m-01")<=DATE_FORMAT(users.regdate,"%Y-%m-01")
AND counting_0_tbl.idprovince = users.idprovince
) as num
FROM
users
GROUP BY
DATE_FORMAT(regdate,"%Y-%m-01"),
idprovince
And IT WORKS PROPERLY, but takes AGES to run, something like 70+ seconds on a 70k lines table.
Any idea of how can I make it more efficient?
I'm increasingly thinking of sticking to the base query and do the cumulation in a second stage, in code...
I'm using Mysql 5.5, but if useful I can upgrade to MySQL 8.
Thanks for any help!