Hope my title makes sense. I have a data frame with a column of numeric values, and I would like to use it to create a new column that maps each value to a bucket based on the range it falls in. Below is some test data, along with the rough-around-the-edges nested ifelse() approach I am currently using. I am hoping to code this in a better way that avoids nested ifelse() calls, since nesting doesn't scale well to many buckets:
mydf = data.frame(strings = letters[1:10],
                  numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                  stringsAsFactors = FALSE)
That is my test data frame, and here is my nested ifelse() approach to the problem:
mydf$buckets = ifelse(mydf$numerics <= 2, 0,
               ifelse(mydf$numerics <= 4, 1,
               ifelse(mydf$numerics <= 5, 2,
               ifelse(mydf$numerics <= 7, 3, 4))))
The code above maps values in the numerics column as follows:
- all values <= 2 go to 0
- all values > 2 and <= 4 go to 1
- all values > 4 and <= 5 go to 2
- all values > 5 and <= 7 go to 3
- all values > 7 go to 4
The nested ifelse() approach doesn't scale well beyond a small number of buckets. Any help with this is appreciated! Thanks,
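
For reference, the bucket mapping described above can be written without nesting by keeping the cut points in a single vector. This is only a minimal sketch, assuming base R's findInterval() with left.open = TRUE (available since R 3.3.0) so that each interval is closed on the right, matching the <= comparisons in the ifelse() version. The name breaks is illustrative and not part of the original code.

```r
# Same test data as in the question.
mydf <- data.frame(strings = letters[1:10],
                   numerics = c(0.2, 0.4, 1.3, 5.2, 3.3, 2.1, 7.3, 1.1, 4.3, 8.3),
                   stringsAsFactors = FALSE)

# Upper edges of buckets 0..3 (illustrative name); anything above the
# last edge falls into the final bucket.
breaks <- c(2, 4, 5, 7)

# With left.open = TRUE, findInterval() counts how many break points are
# strictly below each value, so a value exactly equal to a break point
# stays in the lower bucket, just like `<=` in the nested ifelse().
mydf$buckets <- findInterval(mydf$numerics, breaks, left.open = TRUE)

print(mydf)
```

Adding a bucket then only means adding a value to breaks. Base R's cut() with labels = FALSE might be a comparable option when labelled or factor output is wanted.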