I am trying to consolidate duplicate rows in my dataframe and count the other rows that correspond to duplicates.

Consolidate duplicate rows

This thread was very helpful, but when I used count instead of sum as the function in the ddply approach, I got the error: length(rows)==1 is not TRUE.

ACCT_NUM             DC_NUM   INVOICE_NUM       DATE    DC_PROD_  NUM DELIVERED_QUANTITY                                                          
640324     CCF575-000712116         15283   4-May-15      154609    1       29147104
640324     CCF575-000712116         15283   4-May-15      423580    1       29147104
640324     CCF575-000712116         15283   4-May-15      538010    1       29147104
640324     CCF575-000712116         15283   4-May-15      991900    1       29147104
640324     CCF575-000712116         15283   4-May-15      991940    1       29147104
640324     CCF575-000712116         15283   4-May-15      991960    1       29147104
640324     CCF575-000712116         29289   7-May-15      423580    1       29181744
acylam
kmathers
  • Try `length` instead of `count`. `plyr::count` is built to work on a whole data frame, not just one column. – Gregor Thomas Oct 05 '17 at 18:06
  • sample data? desired output? – Preston Oct 05 '17 at 18:09
  • Please consider these suggestions when you ask a [question](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – shea Oct 05 '17 at 18:38
  • I can't really provide sample data; I'll try to cook something up for the first page. Each column is a list of strings. So `length` would give me the aggregate length of all those strings? I want the number of strings. – kmathers Oct 05 '17 at 18:48
  • Desired output looks like the sample data but consolidated to unique INVOICE_NUM with each other column being a count of how many times an entry occurred, i.e. how many repetitions were there. – kmathers Oct 05 '17 at 19:02
  • Sounds like you want `plyr::count(your_data)`. No `ddply` needed. If that's not what you want, please show the output that corresponds to your sample input. – Gregor Thomas Oct 05 '17 at 21:18

1 Answer

I think that you are looking for dplyr::n() rather than a count function.

With these data:

 df <- data.frame(A = c("A","A","B","B")
                  , B = c("C", "C", "D", "D"))

You can grab the counts like this:

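library(dplyr)  # provides %>%, group_by(), summarise(), and n()

df %>%
  group_by(A, B) %>%       # one group per unique (A, B) combination
  summarise(Count = n())   # n() is the number of rows in each group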

which returns:

       A      B Count
  <fctr> <fctr> <int>
1      A      C     2
2      B      D     2
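
The same pattern might carry over to data shaped like the question's. The following is only a sketch, assuming the goal is one row per INVOICE_NUM with a count of the rows that were collapsed into it. The `invoices` data frame is a hypothetical stand-in built from a few of the posted values, and the names `PRODUCT`, `N_ROWS`, and `N_PRODUCTS` are illustrative choices, not columns from the original data.

```r
library(dplyr)

# Hypothetical stand-in for the question's data (a few values from the posted rows)
invoices <- data.frame(
  INVOICE_NUM = c(15283, 15283, 15283, 29289),
  DATE        = c("4-May-15", "4-May-15", "4-May-15", "7-May-15"),
  PRODUCT     = c(154609, 423580, 538010, 423580)  # illustrative column name
)

per_invoice <- invoices %>%
  group_by(INVOICE_NUM) %>%
  summarise(
    N_ROWS     = n(),                 # rows collapsed into this invoice
    N_PRODUCTS = n_distinct(PRODUCT)  # distinct product values among those rows
  )

per_invoice
```

If only the row count per group is needed, `df %>% count(A, B, name = "Count")` should be an equivalent shorthand for the `group_by()` plus `summarise(Count = n())` pair in reasonably recent dplyr versions.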
Mark Peterson