in (R) for an event study I'm trying to create a column that calculates the mean of ccu_avg for a specific combination of appid and Eventdate1. One appid has multiple events so it has to be divided by both appid and Eventdate1. The difficult thing here is that I want it to calculate the mean only up until the event date since after the event happened the estimation period stops
The new column should look like est_ccu_avg: picture of the dataset below for explanation
https://i.stack.imgur.com/ZPquW.png
Could someone help me figure the code for this out? I've been trying for hours and can't seem to get it to work.
I've now been trying things like this but without success:
study <- study %>%
mutate(est_ccu_avg=
mean(study[unique(study$appid) | study$Eventdate1 >
study$datefinal, "ccu_avg"])
)
Result of dput head:
structure(list(appid = c("105600", "105600", "105600", "105600",
"105600", "105600"), name = c("Terraria", "Terraria", "Terraria",
"Terraria", "Terraria", "Terraria"), ccu_avg = c(26825, 29058,
37842, 37525, 26484, 24377), ccu_min = c(21176, 21620, 28954,
32880, 19648, 19118), ccu_max = c(35827, 41322, 50012, 44071,
33241, 32060), pos_max = c(356186, 356363, 356508, 356712, 356921,
357092), neg_max = c(6756, 6756, 6758, 6768, 6766, 6768), Maj_Upt =
c(0,
0, 0, 0, 0, 0), Min_Upt = c(0, 0, 0, 0, 0, 0), Hotfix = c(0,
0, 0, 0, 0, 0), Bugfix = c(0, 0, 0, 0, 0, 0), Balance = c(0,
0, 0, 0, 0, 0), ExpBranch = c(0, 0, 0, 0, 0, 0), Promo = c(0,
1, 0, 0, 0, 0), Ev_Out = c(0, 0, 0, 0, 0, 0), Ev_In = c(0, 0,
0, 0, 0, 0), isfree = c(0, 0, 0, 0, 0, 0), developers1 = c("Re-
Logic",
"Re-Logic", "Re-Logic", "Re-Logic", "Re-Logic", "Re-Logic"),
publishers1 = c("Re-Logic", "Re-Logic", "Re-Logic", "Re-Logic",
"Re-Logic", "Re-Logic"), metascore = c(83, 83, 83, 83, 83,
83), singleplayer = c(1, 1, 1, 1, 1, 1), multiplayer = c(1,
1, 1, 1, 1, 1), coop = c(1, 1, 1, 1, 1, 1), mmo = c(0, 0,
0, 0, 0, 0), indie = c(1, 1, 1, 1, 1, 1), single_player_gen = c(0,
0, 0, 0, 0, 0), adventure = c(1, 1, 1, 1, 1, 1), casual = c(0,
0, 0, 0, 0, 0), strategy = c(0, 0, 0, 0, 0, 0), rpg = c(1,
1, 1, 1, 1, 1), simulation = c(0, 0, 0, 0, 0, 0), multi_player_gen =
c(0,
0, 0, 0, 0, 0), shooter = c(0, 0, 0, 0, 0, 0), platformer = c(0,
0, 0, 0, 0, 0), ea_min = c(0, 0, 0, 0, 0, 0), ea_max = c(0,
0, 0, 0, 0, 0), scifi = c(0, 0, 0, 0, 0, 0), sports = c(0,
0, 0, 0, 0, 0), racing = c(0, 0, 0, 0, 0, 0), inappurchase = c(0,
0, 0, 0, 0, 0), workshop = c(0, 0, 0, 0, 0, 0), f_release_date =
c("May 16, 2011",
"May 16, 2011", "May 16, 2011", "May 16, 2011", "May 16, 2011",
"May 16, 2011"), l_release_date = c("May 16, 2011", "May 16, 2011",
"May 16, 2011", "May 16, 2011", "May 16, 2011", "May 16, 2011"
), datefinal = structure(c(18942, 18943, 18944, 18945, 18946,
18947), class = "Date"), Eventdate = c("", "", "", "", "",
""), Eventdate1 = structure(c(18949, 18949, 18949, 18949,
18949, 18949), class = "Date"), est_ccu_avg = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))