
I'm working with a cricket (the sport) dataset and need to find the accumulated runs scored and balls bowled per year, along with a year count. Below is an excerpt of the dataset:

[Image: excerpt of the dataset]

I'm trying to aggregate it as shown below, but I can't work out the right code for this:

[Image: expected aggregated output]

Please help

Arun Elangovan
  • Welcome to SO. Please review [how to ask](https://stackoverflow.com/help/how-to-ask) questions. SO respondents expect a [minimal reproducible example/attempt](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including sample data. A screenshot is not useful. Please use [edit](https://stackoverflow.com/posts/49555517/edit) to revise your question. – Maurits Evers Mar 29 '18 at 12:15

1 Answer


This should create the data frame you have:

cricket_df <- data.frame(
  Team1 = rep("DD", 7),
  Team2 = rep("CSK", 7),
  year  = c(2008, 2008, 2009, 2009, 2009, 2010, 2010),
  balls = c(4, 4, 2, 1, 2, 3, 2),
  runs  = c(4, 6, 6, 0, 3, 1, 8)
)

And with this you can aggregate:

aggregate(cricket_df[c("balls", "runs")], by=list(cricket_df$Team1, cricket_df$Team2, cricket_df$year), FUN=sum)
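
As a side note, the same sums can be written with the formula interface of `aggregate`, which keeps the original column names instead of `Group.1`, `Group.2`, `Group.3`. This is only a sketch on the sample data from this answer; the name `totals` is made up for illustration.

```r
# Sample data from the answer above.
cricket_df <- data.frame(
  Team1 = rep("DD", 7),
  Team2 = rep("CSK", 7),
  year  = c(2008, 2008, 2009, 2009, 2009, 2010, 2010),
  balls = c(4, 4, 2, 1, 2, 3, 2),
  runs  = c(4, 6, 6, 0, 3, 1, 8)
)

# Sum balls and runs within each Team1/Team2/year group.
# cbind() on the left-hand side aggregates both columns in one call.
totals <- aggregate(cbind(balls, runs) ~ Team1 + Team2 + year,
                    data = cricket_df, FUN = sum)
print(totals)
```

The result has one row per year with the columns `Team1`, `Team2`, `year`, `balls` and `runs`.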
Wolfgang Arnold
  • But this code does not give year_count. How can I get that? – Arun Elangovan Mar 29 '18 at 18:20
  • True. There may be more elegant ways, but here's a solution that should work: save the aggregate in a new data frame: `aggr_df <- aggregate(cricket_df[c("balls", "runs")], by=list(cricket_df$Team1, cricket_df$Team2, cricket_df$year), FUN=sum)` and then add a column for the year count: `aggr_df$year_count <- table(cricket_df$year)` – Wolfgang Arnold Mar 29 '18 at 19:35
  • Thanks, it works. But with a dataset like the one below it doesn't give the expected results: `df <- data.frame(Team1 = c("DD","DD","DD","DD","DD","DD","DD","RR","RR","RR","RR","RR","RR","RR"), Team2 = rep("CSK", 14), year = c(2008, 2008, 2009, 2009, 2009, 2010, 2010, 2008, 2008, 2009, 2009, 2009, 2010, 2010), balls = c(4,4,2,1,2,3,2,4,4,2,1,2,3,2), runs = c(4,6,6,0,3,1,8,4,4,2,1,2,3,2))` `aggr_df <- aggregate(df[c("balls", "runs")], by=list(df$Team1, df$Team2, df$year), FUN=sum)` `aggr_df$year_count <- table(df$year)` – Arun Elangovan Mar 30 '18 at 02:33
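
A possible way around the problem in the last comment: `table(df$year)` counts rows per year across all teams (4, 6, 4 here), so it no longer lines up with the six Team1/Team2/year groups. The sketch below instead computes the count with the same grouping as the sums and merges the two results. It assumes that `year_count` means the number of rows in each Team1/Team2/year group, which the expected-output image would need to confirm; the names `sums`, `counts` and `result` are made up for illustration.

```r
# Two-team sample data from the comment above.
df <- data.frame(
  Team1 = rep(c("DD", "RR"), each = 7),
  Team2 = rep("CSK", 14),
  year  = rep(c(2008, 2008, 2009, 2009, 2009, 2010, 2010), times = 2),
  balls = rep(c(4, 4, 2, 1, 2, 3, 2), times = 2),
  runs  = c(4, 6, 6, 0, 3, 1, 8, 4, 4, 2, 1, 2, 3, 2)
)

# Totals per Team1/Team2/year group.
sums <- aggregate(cbind(balls, runs) ~ Team1 + Team2 + year,
                  data = df, FUN = sum)

# Number of rows per group, using the same grouping as the sums.
# Assumption: this row count is what year_count is meant to be.
counts <- aggregate(runs ~ Team1 + Team2 + year, data = df, FUN = length)
names(counts)[names(counts) == "runs"] <- "year_count"

# Join on the grouping columns so each count lands on its own group.
result <- merge(sums, counts, by = c("Team1", "Team2", "year"))
print(result)
```

Because the counts are joined on the grouping columns rather than assigned by position, the approach does not depend on how many teams or years the data contains.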