I have a dataframe with columns describing company names (id), functions (category), indicators (factor) and values for these factors (value). The purpose is to plot several boxplots showing the distribution of factor values by function.
Data:
structure(list(id = c("Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea",
"Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea",
"Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea", "Hatchett, Dante",
"Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante",
"Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante",
"Hatchett, Dante", "Hagemeier, Wilmer", "Hagemeier, Wilmer",
"Hagemeier, Wilmer", "Hagemeier, Wilmer", "Hagemeier, Wilmer",
"Hagemeier, Wilmer", "Hagemeier, Wilmer", "Hagemeier, Wilmer",
"Hagemeier, Wilmer", "Hagemeier, Wilmer", "el-Jabour, Suhaa",
"el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa",
"el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa",
"el-Jabour, Suhaa", "Salihi, Divya", "Salihi, Divya", "Salihi, Divya",
"Salihi, Divya", "Salihi, Divya", "Salihi, Divya", "Salihi, Divya",
"Salihi, Divya", "Salihi, Divya", "Salihi, Divya", "al-Jamil, Jaad",
"al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad",
"al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad",
"al-Jamil, Jaad", "Porter, Elijah", "Porter, Elijah", "Porter, Elijah",
"Porter, Elijah", "Porter, Elijah", "Porter, Elijah", "Porter, Elijah",
"Porter, Elijah", "Porter, Elijah", "Porter, Elijah", "Ridgley, Matthew",
"Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew",
"Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew",
"Ridgley, Matthew", "Oats, Jiair", "Oats, Jiair", "Oats, Jiair",
"Oats, Jiair", "Oats, Jiair", "Oats, Jiair", "Oats, Jiair", "Oats, Jiair",
"Oats, Jiair", "Oats, Jiair", "Thompson, Asien", "Thompson, Asien",
"Thompson, Asien", "Thompson, Asien", "Thompson, Asien", "Thompson, Asien",
"Thompson, Asien", "Thompson, Asien", "Thompson, Asien", "Thompson, Asien"
), category = c("will", "will", "will", "will", "will", "deal",
"deal", "deal", "deal", "deal", "will", "will", "will", "will",
"will", "deal", "deal", "deal", "deal", "deal", "will", "will",
"will", "will", "will", "deal", "deal", "deal", "deal", "deal",
"will", "will", "will", "will", "will", "deal", "deal", "deal",
"deal", "deal", "will", "will", "will", "will", "will", "deal",
"deal", "deal", "deal", "deal", "will", "will", "will", "will",
"will", "deal", "deal", "deal", "deal", "deal", "will", "will",
"will", "will", "will", "deal", "deal", "deal", "deal", "deal",
"will", "will", "will", "will", "will", "deal", "deal", "deal",
"deal", "deal", "will", "will", "will", "will", "will", "deal",
"deal", "deal", "deal", "deal", "will", "will", "will", "will",
"will", "deal", "deal", "deal", "deal", "deal"), factor = c("f1",
"f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2",
"f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3",
"f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4",
"f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5",
"f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1",
"f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2",
"f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3",
"f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4",
"f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5"
), value = c(0.339243657473717, 0.384596983617986, 0.0903604942291727,
0.622299975399853, 0.878426613848986, 0.619932561033423, 0.768372484010595,
1.3720186467304, 0.516137222110122, 0.0939216356224454, 0.423330163104718,
1.09092813025095, 1.19417177287019, 0.719465669220584, 0.452970378504298,
-0.262289594598489, 1.22689933746316, 0.816430627598565, 0.225885114542236,
0.632040744287071, 0.104560237280194, 0.381714309901825, 0.62676961473864,
-0.0497874636348734, 0.950027143102881, 0.770846095346556, 0.148980694426281,
0.0441704598142616, 0.490668306336729, 1.02471661138678, 0.156174816905824,
0.31746617387743, 0.156617889567164, 0.0424322867402526, -0.468906139291209,
0.240259904852959, 0.477319222715837, 0.838721253256597, 0.445074674905288,
0.549554109125289, -0.226713556713281, 0.118250559860738, 0.479740692801046,
0.0787136404239509, -0.796681488556265, 0.191482860752725, 0.28786926088113,
0.87763251227066, 0.0338514723682836, 0.235576477670443, -0.0690121807547427,
-0.268401095627916, 0.525430078156439, -0.292741297006626, 0.204765160519623,
0.332993835314161, 0.410545410766758, 0.686637667590553, 0.149842772573679,
0.700177571955539, 0.945997668337351, 0.32488054941514, 0.993151127821943,
0.524358293364559, 0.743356027756573, 0.0247172637782763, 0.205738918048416,
0.922272051144243, 0.264568168014215, 0.800444985485889, 0.0490291076301935,
-0.182296829387635, 0.275266536310165, 0.723462807292679, 1.37681045703127,
0.996572375062412, 0.78567025822639, 0.852269626584109, -0.257367673879751,
0.998810021760118, 0.90491311313343, 1.33803924723801, 1.44241236118906,
1.20343139126242, 0.666758519859951, 1.0151075718858, 0.820298727592033,
1.26452544892297, 0.937448475295236, 0.363135203972494, 0.633056112436769,
0.965685304671053, 0.640992301458128, -0.083835315236123, 1.14088770490309,
0.402326393668432, 0.117951239403618, 0.403472929718899, 1.32109715429833,
0.937023659882023)), class = "data.frame", row.names = c(NA,
-100L))
I am thinking about automating this process. I would like to know how I can:
- Filter my dataframe within a function for each category (will and deal);
- Make boxplots of the factors within each category.
I tried to write a lambda function but did not understand the indexing, or how to filter the abstract dataframe that is defined as the function's argument. Conceptually, I understand that I am supposed to do something like this:
plots_fun <- function(dataframe){
a <- ggplot(data = dataframe[,1], ...)
}
Also, I thought about using lapply. But my first step is to write the function, which is actually what I am struggling with.
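One possible shape for such a function, as a minimal sketch: it takes the dataframe and a category name, filters the rows with base R indexing, and returns a ggplot object. The name plot_category, its argument names, and the small toy dataframe are hypothetical and only mirror the column names of the sample data.

```r
library(ggplot2)

# Hypothetical helper; the function and argument names are illustrative only.
plot_category <- function(dataframe, category_name) {
  # Row indexing: keep the rows whose category equals category_name,
  # and keep all columns (nothing after the comma).
  subset_df <- dataframe[dataframe$category == category_name, ]

  ggplot(subset_df, aes(x = factor, y = value)) +
    geom_boxplot() +
    ggtitle(category_name)
}

# Made-up toy data with the same column names as the sample data
toy <- data.frame(
  id       = rep(c("a", "b"), each = 4),
  category = rep(c("will", "will", "deal", "deal"), times = 2),
  factor   = rep(c("f1", "f2"), times = 4),
  value    = c(0.3, 0.4, 0.6, 0.8, 0.4, 1.1, -0.3, 1.2)
)

p_will <- plot_category(toy, "will")
```

Note that dataframe[, 1] selects the first column only, whereas filtering rows requires a logical condition before the comma, as in the sketch.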
In the case of my sample data, the desired output is two plots, one for will and one for deal:
library(dplyr)
library(ggplot2)

ggplot(data = sample_data %>% filter(category == "will"),
       aes(factor, value)) +
  geom_boxplot()

ggplot(data = sample_data %>% filter(category == "deal"),
       aes(factor, value)) +
  geom_boxplot()
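For reference, the two manual calls above could be generalized roughly as follows. This is only a sketch of one possible approach, assuming base R's split() to cut the dataframe into one piece per category and lapply() to build a plot from each piece. The toy dataframe is made up and only mirrors the column names of the sample data.

```r
library(ggplot2)

# Made-up toy data with the same column names as the sample data
toy <- data.frame(
  id       = rep(c("a", "b"), each = 4),
  category = rep(c("will", "will", "deal", "deal"), times = 2),
  factor   = rep(c("f1", "f2"), times = 4),
  value    = c(0.3, 0.4, 0.6, 0.8, 0.4, 1.1, -0.3, 1.2)
)

# split() returns a named list of dataframes, one per category value
by_category <- split(toy, toy$category)

# Build one boxplot per category; iterating over the names lets each
# plot carry its category as a title.
plots <- lapply(names(by_category), function(category_name) {
  ggplot(by_category[[category_name]], aes(x = factor, y = value)) +
    geom_boxplot() +
    ggtitle(category_name)
})
names(plots) <- names(by_category)

# Display a single plot by name
plots[["will"]]
```

Because the list is named by category, the same code would handle any number of categories without changes.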