2

I have a table (we're on InfoBright columnar storage and I use MySQL Workbench as my interface) that essentially tracks users and a count of activities with a datestamp. It's a daily aggregate table. Schema is essentially

userid (int)
activity_count (int)
date (date)

What I'm trying to find is how many of my users are churning from month to month, with a basis of an active user defined as one with a monthly activity count that sums up to > 10

To find how many users are active in a given month I am currently using

select year, month, count(distinct user) as users
from
(
select YEAR(date) as year, MONTH(date) as month, userid as user, sum(activity_count) as activity
from table
group by YEAR(date), MONTH(date), userid
having activity > 10
order by YEAR(date), MONTH(date)
) t1
group by year, month

Not being a SQL expert, I am sure this can be improved and would appreciate the input on that.

My bigger goal though is to figure out from month to month, how many of the users who are in this count are new or repeat from the previous month. I don't know how to do that without what feels like ugly nesting or joining, and I feel like it should be fairly simple.

Thanks in advance.

Java Devil
  • 10,629
  • 7
  • 33
  • 48

1 Answers1

0

I think that further nesting is the best way to achieve this. I would look to do something like selecting the user for the min concatenated Year & Month as a middle layer to the above (i.e. between outer and inner queries) so that you can establish the first month that the user became active. You can then add a where clause to the outer query to filter so that only the months you require are showing. Let me know if you need help with the syntax.

ChrisProsser
  • 12,598
  • 6
  • 35
  • 44
  • thanks - was hoping to find a 'better' way than nesting, but perhaps not! – no_longer_a_techie Jun 25 '13 at 00:20
  • There is nothing wrong with nesting, it's the most efficient way to run a whole lot of queries. SQL has relatively few keywords (which I personally see as a strength because I can actually remember them). One side effect of this is there are not many ways to flatten out queries with one liners as you might do in some scripting languages. – ChrisProsser Jun 25 '13 at 07:13