I want to summarise my data frame small for each distinct Video.ID using dplyr:
small %>%
  group_by(Video.ID) %>%
  summarise(sumr = sum(Partner.Revenue),
            len = mean(Video.Duration..sec.),
            cat = mean(Category))
mean(Category) is clearly the wrong approach, since Category is a character column. How do I get summarise to simply use the value that is repeated within each group? (A given Video.ID always has the same Category, no matter how often it appears in the data frame.)
My data frame looks like this:
small
# A tibble: 6 x 7
     X1  X1_1 Video.ID    Video.Duration..sec. Category Owned.Views Partner.Revenue
  <int> <int> <chr>                      <int> <chr>          <int>           <dbl>
1     1     1 ---0zh9uzSE                 1184 gadgets            6               0
2     2     2 ---0zh9uzSE                 1184 gadgets            6               0
3     3     3 ---0zh9uzSE                 1184 gadgets            2               0
4     4     4 ---0zh9uzSE                 1184 gadgets            1               0
5     5     5 ---0zh9uzSE                 1184 gadgets            1               0
6     6     6 ---0zh9uzSE                 1184 gadgets            3               0
small <-
  structure(list(X1 = 1:6,
                 X1_1 = 1:6,
                 Video.ID = c("---0zh9uzSE", "---0zh9uzSE", "---0zh9uzSE", "---0zh9uzSE", "---0zh9uzSE", "---0zh9uzSE"),
                 Video.Duration..sec. = c(1184L, 1184L, 1184L, 1184L, 1184L, 1184L),
                 Category = c("gadgets", "gadgets", "gadgets", "gadgets", "gadgets", "gadgets"),
                 Owned.Views = c(6L, 6L, 2L, 1L, 1L, 3L),
                 Partner.Revenue = c(0, 0, 0, 0, 0, 0)),
            row.names = c(NA, -6L),
            class = c("tbl_df", "tbl", "data.frame"))
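For reference, here is a minimal sketch of two ways this kind of summary is commonly written, assuming Category really is constant within each Video.ID. The data frame is rebuilt compactly (without the X1 and X1_1 index columns) only so the snippet runs on its own, and the result names by_first and by_group are made up for illustration.

```r
library(dplyr)

# Same values as `small` above, minus the X1/X1_1 index columns,
# rebuilt here so the snippet is self-contained.
small <- data.frame(
  Video.ID = rep("---0zh9uzSE", 6),
  Video.Duration..sec. = rep(1184L, 6),
  Category = rep("gadgets", 6),
  Owned.Views = c(6L, 6L, 2L, 1L, 1L, 3L),
  Partner.Revenue = rep(0, 6),
  stringsAsFactors = FALSE
)

# Option 1: take the first Category seen in each group.
# Assumption: Category never varies within a Video.ID.
by_first <- small %>%
  group_by(Video.ID) %>%
  summarise(sumr = sum(Partner.Revenue),
            len = mean(Video.Duration..sec.),
            cat = first(Category))

# Option 2: group by Category as well, so it is carried through
# as a grouping column instead of being summarised.
by_group <- small %>%
  group_by(Video.ID, Category) %>%
  summarise(sumr = sum(Partner.Revenue),
            len = mean(Video.Duration..sec.)) %>%
  ungroup()
```

Option 1 silently picks one value if a Video.ID ever had more than one Category, whereas Option 2 would produce an extra row per combination, which makes such inconsistencies visible.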