I have a dataframe punkt_tabelle
containing points made in a game. Each game has 2 or 3 sets (3 in the MRE). The dataframe contains how the points were made. I also have the scores at the end of the set, which is stored in scores
.
I calculate the sum for each team in each set. (I did this in total_pts
).
What I am trying to achieve is to compare the sum of the points from the datatable (per team and per set) with the points this team made according to the scores
. If scores
in this set is bigger than the sum calculated as total
then I want to add an extra row to the datatable. This new row should contain the teamname, the set and the skill for this new row should be "Opp. Other Errors" and the value for Pkt
would be the difference between scores
and total
. It maybe (and is the case in the MRE), that a new row has to be added for every team and every set.
If you would re-run the total_pts calculation after the new rows were added, it would therefore equal the results in scores
.
I tried variations of the code below according to these questions and articles (R Conditional evaluation when using the pipe operator %>%, Inserting a new row to data frame for each group id), but could not work out a solution for my problem.
Here is the last version of my code:
library(dplyr)
library (devtools)
punkt_tabelle <- structure(list(Team = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("Miller/Myer", "Winter/Summer"), class = "factor"),
Skill = structure(c(1L, 1L, 3L, 2L, 2L, 2L, 1L, 1L, 3L, 2L,
2L, 2L, 4L, 4L, 5L, 6L, 6L, 6L, 4L, 4L, 5L, 6L, 6L, 6L), .Label = c("Attack",
"Service", "Block", "Opp. Attack Error", "Opp. Block Error",
"Opp. Serve Error"), class = "factor"), Set = c(2L, 3L, 2L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L), Pkt = c(2L, 1L, 1L, 0L, 0L, 0L,
3L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L,
0L, 1L, 0L)), row.names = c(NA, -24L), vars = c("Team", "Skill"
), indices = list(0:1, 2L, 18:19, 20L, 21:23, 3:5, 6:7, 8L, 12:13,
14L, 15:17, 9:11), group_sizes = c(2L, 1L, 2L, 1L, 3L, 3L,
2L, 1L, 2L, 1L, 3L, 3L), biggest_group_size = 3L, labels = structure(list(
Team = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L), .Label = c("Miller/Myer", "Winter/Summer"), class = "factor"),
Skill = c("Attack", "Block", "Opp. Attack Error", "Opp. Block Error",
"Opp. Serve Error", "Service", "Attack", "Block", "Opp. Attack Error",
"Opp. Block Error", "Opp. Serve Error", "Service")), row.names = c(NA,
-12L), class = "data.frame", vars = c("Team", "Skill")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
score_miller_myer <- c(3,6,3) #total points in sets 1, 2, 3
score_winter_summer <- c(5,4,5)
scores <- c(score_miller_myer, score_winter_summer)
#calculate the sum of the points per team and per set
total_pts <- punkt_tabelle %>% group_by(Team, Set) %>% summarize(total = sum(Pkt))
total_pts
#try to compare with the score and add en entry in the dataframe
punkt_tabelle %>%
group_by (Team, Set) %>%
mutate(total = sum(Pkt)) %>%
{if (total<scores) dplyr::bind_rows(Team=Team, Set=Set, Skill="Opp. Other Error", Pkt=(total-scores))}
punkt_tabelle
Can this somehow be done like this? Or do I need to use a loop and do that manually for each set and team? Please help!
EDIT: The expected output in this example would look like this:
Team Skill Set Pkt
<fct> <fct> <int> <int>
1 Miller/Myer Attack 2 2
2 Miller/Myer Attack 3 1
3 Miller/Myer Block 2 1
4 Miller/Myer Service 1 0
5 Miller/Myer Service 2 0
6 Miller/Myer Service 3 0
7 Winter/Summer Attack 1 3
8 Winter/Summer Attack 2 1
9 Winter/Summer Block 3 1
10 Winter/Summer Service 1 0
11 Winter/Summer Service 2 1
12 Winter/Summer Service 3 1
13 Winter/Summer Opp. Attack Error 2 0
14 Winter/Summer Opp. Attack Error 3 0
15 Winter/Summer Opp. Block Error 2 0
16 Winter/Summer Opp. Serve Error 1 0
17 Winter/Summer Opp. Serve Error 2 1
18 Winter/Summer Opp. Serve Error 3 1
19 Miller/Myer Opp. Attack Error 1 1
20 Miller/Myer Opp. Attack Error 2 0
21 Miller/Myer Opp. Block Error 3 0
22 Miller/Myer Opp. Serve Error 1 0
23 Miller/Myer Opp. Serve Error 2 1
24 Miller/Myer Opp. Serve Error 3 0
25 Winter/Summer Opp. Other Error 1 2 #here start the added rows
26 Winter/Summer Opp. Other Error 2 1
27 Winter/Summer Opp. Other Error 3 2
28 Miller/Myer Opp. Other Error 1 2
29 Miller/Myer Opp. Other Error 2 2
30 Miller/Myer Opp. Other Error 3 2
Further explanation of the problem: A team scores in various ways. Either they score themselves (Attack, Serve, Block) or their opponent makes a mistake (Opp. Attack Error, Opp. Serve Error, Opp. Block Error). Still that leaves some difference to the total score they reach, because there are some errors of the opponents which are not specified. For this I want to add a row "Opp. Other errors" after calculating the difference.
Example: in row 26: the value of Pkt is 1 because in total_pts
in Set 2 team Winter/Summer has 3 points. But their score according to score_winter_summer
in set 2 is 4 points. So there is a difference of 1 point which is added in the new row.