1

I have a dataframe which looks like the following:

ID    month    country   count    style
1     2012-02  UK        3        high
1     2012-02  US        10       high
1     2012-02  FR        5        high
etc

Now, I want to aggregate the values over the ID and country variable and, therefore, I use:

aggregated_data = setDT(subset)[, .(Country = list(Country), ID = min(ID), 
count = sum(count), by = list(Model, Month)][]

To get

ID    month    country     count    
1     2012-02  UK, US, FR   18      
etc

However, since my style variable is a factor I do not know how I can incorporate this in the aggregated table. The values of the factor variable is always the same for one ID, so I only need to print the first value of the style variable for the style variable in the aggregated table. Does anyone know how to do this?

jay.sf
  • 60,139
  • 8
  • 53
  • 110
Activation
  • 93
  • 6
  • Your expected output is ambiguous, what exactly is counted in your resulting table? BTW, have you ever concerned about how to [properly provide data on Stack Overflow?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). You could do this in a better way than you've done before. – jay.sf Feb 18 '19 at 09:56

1 Answers1

1

You could just use unique, e.g.

df <- setDT(df)
df[, .(country = toString(country), count = sum(count), style = unique(style)), by = list(ID, month)]
#   ID   month    country count style
#1:  1 2012-02 UK, US, FR    18  high

Or using dplyr

df %>%
    group_by(ID, month) %>%
    summarise(
        country = toString(country),
        count = sum(count),
        style = unique(style))
## A tibble: 1 x 5
## Groups:   ID [?]
#     ID month   country    count style
#  <int> <fct>   <chr>      <int> <fct>
#1     1 2012-02 UK, US, FR    18 high

Both methods assume that style is always the same per ID and month.


Sample data

df <- read.table(text =
    "ID    month    country   count    style
1     2012-02  UK        3        high
1     2012-02  US        10       high
1     2012-02  FR        5        high", header = T)
Maurits Evers
  • 49,617
  • 4
  • 47
  • 68