
I have a dataset of online purchases from one site. Each row represents a different packed item, but it does not necessarily represent a separate order. I would like to know how many different items were packed in one parcel. The variable order_code identifies a specific order.

How can I count the rows that share the same order_code? That count would directly correspond to the number of items per order. I tried:

data$result <- group_by(data,order_code)

This does not return the desired outcome.

The data and the final output should look like the table below:

order_code  date         desired output
302492016   2016-07-01  
302492016   2016-07-01    2
302502016   2016-07-01  
302502016   2016-07-01    2
302512016   2016-07-01  
302512016   2016-07-01    2
302522016   2016-07-01    1
302532016   2016-07-01  
302532016   2016-07-01    2
  • Please provide a reproducible example along with your desired output. – 989 Oct 10 '16 at 12:32
  • You need to summarise the data. Perhaps `data %>% group_by(order_code) %>% summarise(n = n())`, but it would be easier to tell with a sample of the data. – Richard Telford Oct 10 '16 at 12:32
  • I have added an example of my table to the question, if it helps... – Andraž Poje Oct 10 '16 at 12:56
  • Probably this: http://stackoverflow.com/questions/7450600/count-number-of-observations-rows-per-group-and-add-result-to-data-frame. And I really have no idea why you thought your attempt would work. You should start with the basics. You can't even distinguish base R from an external package, not to mention how to use that package. – David Arenburg Oct 10 '16 at 13:06

1 Answer

With the following example data:

> df
  order_code       date
1  302492016 2016-07-01
2  302492016 2016-07-01
3  302502016 2016-07-01
4  302502016 2016-07-01
5  302512016 2016-07-01
6  302512016 2016-07-01
7  302522016 2016-07-01
8  302532016 2016-07-01
9  302532016 2016-07-01

Count frequency of order_code:

> library(plyr)
> freq <- count(df, 'order_code')

> freq
  order_code freq
1  302492016    2
2  302502016    2
3  302512016    2
4  302522016    1
5  302532016    2
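
If adding plyr as a dependency is not desirable, the same per-order counts can probably be obtained in base R with `aggregate()`. The sketch below rebuilds `df` by hand from the example above, so the column types (numeric order_code, Date for date) are an assumption, not the asker's actual data.

```r
# Example data rebuilt by hand from the table above (column types are assumed)
df <- data.frame(
  order_code = c(302492016, 302492016, 302502016, 302502016, 302512016,
                 302512016, 302522016, 302532016, 302532016),
  date = as.Date("2016-07-01")
)

# Count the rows in each order_code group; the result has one row per order
freq <- aggregate(
  list(freq = df$order_code),
  by = list(order_code = df$order_code),
  FUN = length
)

print(freq)
```

The result has the same two columns, order_code and freq, so it can be used in the merge step in the same way.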

Merge with the original dataframe:

> df2 <- merge(df, freq, by = "order_code")

> df2
  order_code       date freq
1  302492016 2016-07-01    2
2  302492016 2016-07-01    2
3  302502016 2016-07-01    2
4  302502016 2016-07-01    2
5  302512016 2016-07-01    2
6  302512016 2016-07-01    2
7  302522016 2016-07-01    1
8  302532016 2016-07-01    2
9  302532016 2016-07-01    2
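
The count and merge steps can also be collapsed into one call. A minimal sketch using base R's `ave()`, which applies a function within each group and returns a vector aligned with the original rows. Here `df` is rebuilt by hand from the example data, so its column types are an assumption.

```r
# Example data rebuilt by hand from the table above (column types are assumed)
df <- data.frame(
  order_code = c(302492016, 302492016, 302502016, 302502016, 302512016,
                 302512016, 302522016, 302532016, 302532016),
  date = as.Date("2016-07-01")
)

# For every row, compute the size of its order_code group.
# seq_along() only supplies a vector of the right length to measure;
# the grouping is done by the second argument.
df$freq <- ave(seq_along(df$order_code), df$order_code, FUN = length)

print(df)
```

Because `ave()` keeps the original row order, no merge is needed and the rows are not re-sorted by order_code.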
Shearn