I'm trying to create a function that bins data based on multiple conditions. My data has two variables: max_dist and activated.
The function should create one vector per bin. For each observation, it should check whether max_dist falls within the bin's range, then append a 1 to the vector if it does and activated is TRUE, or a 0 to the vector if activated is FALSE.
The key part is that, for each observation, if max_dist is greater than the bin's range but activated is also TRUE, then I would like to include a 0 in that bin. So some observations with high max_dist values will be binned multiple times.
Currently I have structured it like this (shortened version; the full version has 6 bins):
binning_function <- function(df) {
  #create a series of vectors corresponding to bins
  two_hundred <- c()
  four_hundred <- c()
  #iterate through dataframe to add 0 or 1 values to each vector
  for (i in 1:nrow(df)) {
    if (df$activated[i]==TRUE && df$max_dist[i]<=0.2) {
      append(two_hundred, 1)
    }
    else if (df$max_dist[i]>0.2 || df$activated[i]==FALSE) {
      append(two_hundred, 0)
    }
  }
  for (i in 1:nrow(df)) {
    if (df$activated[i]==TRUE && df$max_dist[i]>0.2 && df$max_dist[i]<=0.4) {
      append(four_hundred, 1)
    }
    else if (df$max_dist[i]>0.4 || df$activated[i]==FALSE) {
      append(four_hundred, 0)
    }
  }
  return(list(two_hundred,four_hundred))
}
When I run this function on a data frame, it returns a list of NULLs instead of the filled vectors:
[[1]]
NULL
[[2]]
NULL
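A likely cause of the NULLs is that append() returns a new vector instead of modifying its argument in place, so each call in the loops above discards its result unless it is assigned back. The sketch below shows that behaviour and then one possible vectorised form of the binning rule. The helper bin_one, its lower/upper arguments, and the example data are made up for illustration, and the rule it encodes (drop activated rows that fall below the bin, as the loops above appear to do) is an assumption about the intended behaviour.

```r
# append() returns a modified copy; it does not change its first argument.
x <- c()
append(x, 1)       # result is discarded, so x is still NULL here
x <- append(x, 1)  # result is assigned back, so x now holds 1

# Hypothetical helper: 0/1 values for a single bin covering (lower, upper].
bin_one <- function(df, lower, upper) {
  in_range <- df$max_dist > lower & df$max_dist <= upper
  # Assumed rule: skip activated rows that fall below this bin;
  # keep everything else (non-activated rows and rows above the lower bound).
  keep <- !df$activated | df$max_dist > lower
  # 1 only when activated and inside the range, otherwise 0.
  as.integer(df$activated & in_range)[keep]
}

# Made-up example data, not taken from the real data set.
df <- data.frame(
  max_dist  = c(0.1, 0.3, 0.5, 0.1),
  activated = c(TRUE, TRUE, TRUE, FALSE)
)

bins <- list(
  two_hundred  = bin_one(df, -Inf, 0.2),
  four_hundred = bin_one(df, 0.2, 0.4)
)
```

Under that assumed rule, the third row (max_dist 0.5, activated) contributes a 0 to both bins, while the first row (max_dist 0.1, activated) appears only in the first bin, so the vectors can differ in length. Keeping the loops and writing two_hundred <- append(two_hundred, 1) should also work, just more slowly on large data frames.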