
In R:

I am not sure what the proper title for this question is, so maybe someone can help me out. It would be greatly appreciated. I'm sorry if this is called something easily searchable.

So I have a ragged-array matrix (multiple UPCs):

     [upc]    [quantity]          [day]    [daysum]
[1]  123         11                 1         NA     
[2]  123          2                 1         NA 
[3]  789          5                 1         NA 
[4]  456         10                 2         NA 
[5]  789          6                 2         NA 

I want the matrix to be summed by UPC for each day, for example:

     [upc]    [quantity]          [day]    [daysum]
[1]  123         11                 1         13
[2]  123          2                 1         13
[3]  789          5                 1         5
[4]  456         10                 2         10
[5]  789          6                 2         6

Thank you for your time and help.

wolfsatthedoor
  • Sorry, I apologize, you mean to provide the code for the matrices so users can work with them to see if their answer works, correct? – wolfsatthedoor Sep 04 '13 at 05:16
  • Yes. At least a minimal example. See, for example, [the suggestions here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). However, by the looks of it, @DWin has you covered with filling in your "long" matrix with the sum by "upc" and "day". – A5C1D2H2I1M1N2O1R2T1 Sep 04 '13 at 05:19

1 Answer

You have not described what is supposed to happen with the "clean matrix", but the code to create a "column" from your larger matrix, suitable for binding to it on a row-aligned basis, is quite simple:

    B <- cbind(B, daysum = ave(B[, 'quantity'],         # analysis variable
                               B[, 'upc'], B[, 'day'],  # strata variables
                               FUN = sum))              # function applied in strata

This of course assumes that B really has the column names as indicated. It should also work if B is actually a data frame, although the output does not suggest that you actually have R objects yet. The `ave` function returns a vector the same length as its first argument, so the sum is replicated across all rows that share the same stratification variables.
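Since the question did not include code to build the matrix, here is a minimal reproducible sketch. It assumes B is a numeric matrix with the column names `upc`, `quantity` and `day` taken from the printed table; the construction with `cbind` is my guess at how such a matrix might be created, not something stated in the question.

```r
# Rebuild the example from the question as a numeric matrix with column names
# (assumed layout; the question only shows the printed values).
B <- cbind(upc      = c(123, 123, 789, 456, 789),
           quantity = c(11, 2, 5, 10, 6),
           day      = c(1, 1, 1, 2, 2))

# ave() returns a vector as long as its first argument, with each element
# replaced by FUN applied to all values in the same (upc, day) group.
daysum <- ave(B[, "quantity"], B[, "upc"], B[, "day"], FUN = sum)

# Bind the result on as a new column; rows stay aligned with the original.
B <- cbind(B, daysum = daysum)
B[, "daysum"]
# 13 13 5 10 6
```

The two rows for UPC 123 on day 1 both receive 13, which matches the desired output in the question.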

IRTFM
  • Thank you, this is an incredibly useful function. I will now post better vis a vis the reproducible requirements and such. – wolfsatthedoor Sep 04 '13 at 05:25
  • Yes to "incredibly useful". I have in the past suggested that `ave` needs a better PR firm representing its standing in the R world. – IRTFM Sep 04 '13 at 05:29
  • I have used this literally hundreds of times since I asked this. Thank you so much again, and I hope the PR takes off – wolfsatthedoor Mar 20 '14 at 16:43
  • Do you know how to do this without looping when you want to create multiple columns, each one made by summing over one analysis variable at a time (all with the same strata variables)? For example, if I had multiple quantity columns, q1-q20, and I wanted 20 sum columns, sum1-sum20. – wolfsatthedoor Mar 29 '14 at 20:23
  • I'm not exactly sure I understand. Post a question with an example with, say, 5 rows and three columns. – IRTFM Mar 29 '14 at 21:58
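
For the follow-up about several quantity columns, one possible sketch (not from the answerer) is to call `ave` once per column through `lapply`. This still iterates internally, but it avoids an explicit `for` loop. The data frame `df`, the three columns q1-q3 standing in for q1-q20, and all the values are made up for illustration.

```r
# Hypothetical data: three quantity columns instead of twenty.
df <- data.frame(upc = c(123, 123, 789, 456, 789),
                 day = c(1, 1, 1, 2, 2),
                 q1  = c(11, 2, 5, 10, 6),
                 q2  = c(1, 1, 1, 1, 1),
                 q3  = c(0, 4, 2, 3, 7))

# Pick out the quantity columns by name pattern.
qcols <- grep("^q[0-9]+$", names(df), value = TRUE)

# Apply ave() to each quantity column with the same strata (upc, day).
sums <- lapply(df[qcols], function(x) ave(x, df$upc, df$day, FUN = sum))
names(sums) <- sub("^q", "sum", qcols)   # q1 -> sum1, q2 -> sum2, ...

# Add the new columns to the data frame in one assignment.
df[names(sums)] <- sums
df
```

Each `sumN` column holds the per-(upc, day) total of the matching `qN` column, repeated on every row of the group.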