For a sample dataframe:
df1 <- structure(list(place = c("a", "a", "b", "b", "b", "b", "c", "c",
"c", "d", "d"), animal = c("cat", "bear", "cat", "bear", "pig",
"goat", "cat", "bear", "goat", "goat", "bear"), number = c(5,
6, 7, 4, 5, 6, 8, 5, 3, 7, 4)), .Names = c("place", "animal",
"number"), row.names = c(NA, -11L), spec = structure(list(cols = structure(list(
place = structure(list(), class = c("collector_character",
"collector")), animal = structure(list(), class = c("collector_character",
"collector")), number = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("place", "animal", "number")),
default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df",
"tbl", "data.frame"))
I want to create a variable 'sum' that holds the total of the 'number' column for each 'place' (regardless of animal), and add it to the dataframe as a new column.
The command below:
df1$sum <- aggregate(df1$number, by=list(Category=df1$place), FUN=sum)
... computes the per-place sums, but the assignment fails: aggregate() returns one row per distinct place (4 rows), while the dataframe has 11 rows, which gives this error:
Error in `$<-.data.frame`(`*tmp*`, sum, value = list(Category = c("a", :
replacement has 4 rows, data has 11
Any ideas on how I can add this extra column to my dataframe?
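One possible approach, sketched here under some assumptions: base R's ave() applies a function within groups and returns a vector as long as its input, so the result can be assigned directly as a column. The snippet rebuilds the sample data as a plain data.frame (dropping the readr 'spec' attribute and the tibble classes for brevity), and the name 'totals' is only illustrative.

```r
# Simplified copy of the sample data as a plain data.frame
# (assumption: the readr 'spec' attribute is not needed here)
df1 <- data.frame(
  place  = c("a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d"),
  animal = c("cat", "bear", "cat", "bear", "pig", "goat",
             "cat", "bear", "goat", "goat", "bear"),
  number = c(5, 6, 7, 4, 5, 6, 8, 5, 3, 7, 4),
  stringsAsFactors = FALSE
)

# aggregate() collapses the data to one row per place,
# which is why it cannot be assigned to an 11-row dataframe
totals <- aggregate(number ~ place, data = df1, FUN = sum)
nrow(totals)

# ave() computes the group sum but repeats it for every row of the group,
# so the result has the same length as df1$number
df1$sum <- ave(df1$number, df1$place, FUN = sum)
df1
```

With this, every row for place "a" should carry the same total, every row for "b" its own total, and so on. If the summary table from aggregate() is wanted as well, it could instead be joined back onto the original data with merge() by 'place'.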