4

I can group users by count in each month like this:

User.group('year(created_at)').group('month(created_at)').count
#=> {[2015, 4]=>90, [2015, 5]=>133, [2015, 6]=>131, [2015, 7]=>28, [2015, 8]=>45, [2015, 9]=>6}

I want to create stats, showing how users count was growing by months.

So it would return something like this:

{[2015, 4]=>20, [2015, 5]=>40, [2015, 6]=>55, [2015, 7]=>70, [2015, 8]=>100, [2015, 9]=>130}
# each entry is a year, month and total users count from the beginning of time by the end of this month.

How would I get the wanted result?

I am looking for a database level solution (ActiveRecord query) if possible.

Any suggestions are appreciated.

Andrey Deineko
  • 51,333
  • 10
  • 112
  • 145
  • Please edit to explain how `[2015,4]=>20` and `[2015,5]=>40` were computed. It would also be helpful to assign a variable to the input hash (e.g., `h = User.group(....`), so that those giving answers can refer to the variable without having to define it. – Cary Swoveland Sep 11 '15 at 15:10
  • @CarySwoveland As I mentioned in the comment of desired output, the `[2015,4]=>20` means that in 4th month of year 2015 (the beginning of the application existence) 20 users registered in the application, `[2015,5]=>40` means, that from the beginning of times (4th month 2015) until end of 5th month 40 users registered, `[2015, 6]=>55` means, that from beginning of times unitl end of 6th month of 2015 55 users registered and so on, so that i have exponential statistics of users count in the application by months. – Andrey Deineko Sep 11 '15 at 16:06
  • Andrey, that's what I figured (cumulative users by month), but I don't know where the values of the desired output come from. For example, how did you get `20` from `90`? Also, I don't know what you mean by "exponential" in the title or why all the values in the desired output are multiples of `5` (if not coincidence). – Cary Swoveland Sep 11 '15 at 16:21
  • The collection to operate on is `User::ActiveRecord_Relation`. I am afraid I can't provide you with the entire collection. Edited the title with cumulative (the word I was looking for, thanks!) regarding multiples of `5` - coincidence :) – Andrey Deineko Sep 11 '15 at 16:28
  • Possible duplicate: http://stackoverflow.com/questions/17664436/cumulative-sum-over-a-set-of-rows-in-mysql – Alexey Shein Sep 11 '15 at 17:08

2 Answers2

0

So, correct me if I misunderstood anything. You have user registrations by months, i.e. 1 user in Jan, 2 users in Feb and 3 in March and you want to build a chart with user growth, so you want 1 user in Jan, 3 users in Feb and 6 in March. If that's the case, you can do the following:

counts = {[2015, 4]=>1, [2015, 5]=>2, [2015, 6]=>3, [2015, 7]=>4, [2015, 8]=>5, [2015, 9]=>6}     

keys = counts.keys # save keys => [[2015, 4], [2015, 5], [2015, 6], [2015, 7], [2015, 8], [2015, 9]]

acc_values = counts.values.dup # .dup is needed if you don't want to spoil original values  
# because of for the following map! operation, 
# right now acc_values returns => [1, 2, 3, 4, 5, 6]

# use map! instead of map so we have access to already changed items, 
# while we iterating over the next
acc_values.map!.with_index { |val, key| key.zero? ? val : acc_values[key - 1] + val }    
# => [1, 3, 6, 10, 15, 21]

# build resulting hash
acc_counts = keys.zip(acc_values).to_h    
# => {[2015, 4]=>1, [2015, 5]=>3, [2015, 6]=>6, [2015, 7]=>10, [2015, 8]=>15, [2015, 9]=>21}
Alexey Shein
  • 7,342
  • 1
  • 25
  • 38
  • Unfortunately, this is not the case - I don't have counts, I am aiming to get those. Please, read the discussion under my question, it could be helpful in clarifying what I need. – Andrey Deineko Sep 11 '15 at 16:55
  • @AndreyDeineko Oh, it seems you need the backwards operation, right? So, you have 1 in Jan, 3 in Feb, and 6 in March and you want them back to 1, 2, 3? – Alexey Shein Sep 11 '15 at 16:59
  • You can use your code `User.group('year(created_at)').group('month(created_at)').count` to get raw counts by months, and them my code to convert them to cumulative counts. If you want a complete SQL solution you'll have to resort to using variables which can lead to a quite ugly solution, like this: http://stackoverflow.com/questions/17664436/cumulative-sum-over-a-set-of-rows-in-mysql – Alexey Shein Sep 11 '15 at 17:07
0

If g is a given hash where g[[y,m]] is the number of new users in year y, month m and you want to create a hash h for which h[[y,m]] is the cumulative number of users in year y, month m, rounded to the nearest 5, h can be computed as follows:

g = {[2015, 4]=>90, [2015, 5]=>133, [2015, 6]=>131, [2015, 7]=>28,
     [2015, 8]=>45, [2015, 9]=>6}

h = g.keys.sort.each_with_object(Hash.new(0)) {|k,h| h[k]=g[k]+ h[previous(k)]}
  #=> {[2015, 4]=>90, [2015, 5]=>223, [2015, 6]=>354, [2015, 7]=>382,
  #    [2015, 8]=>427, [2015, 9]=>433} 
h.update(h) { |_,v| round_to_nearest(v, 5) }
  #=> {[2015, 4]=>90, [2015, 5]=>225, [2015, 6]=>355, [2015, 7]=>380,
  # [2015, 8]=>425, [2015, 9]=>435} 

 def previous((year, month))
  month -= 1
  (year -=1; month = 12) if month.zero?
  [year, month]
end

def round_to_nearest(n, d)
  d*(n/d.to_f).round
end

Please let me know if I've misunderstood the question (e.g., perhaps you intended to do this in the database). If so, I'll amend or delete my answer.

Cary Swoveland
  • 106,649
  • 6
  • 63
  • 100
  • thanks a lot! you've got the idea right, but the issue is that I can't define methods here. And yeah, I was planning to do it on the database level (ActiveRecord query). – Andrey Deineko Sep 11 '15 at 18:34
  • Could this be translated to an `ActiveRecord` query? If so, consider adding an answer yourself. (I'm not familiar with `ActiveRecord`.) – Cary Swoveland Sep 11 '15 at 19:55