I have a question about how cut2 from Hmisc in R bins continuous variables, and why in some cases the cutpoints are not all treated the same way.
I am aware of this question, especially the answer from Christopher Bottoms, but it doesn't address why cut2 doesn't seem to respect the cutpoints supplied to it in certain circumstances.
Given
library(Hmisc)
v <- 1:12
v
I want to supply a list of cutpoints (a, b, c, ..., y, z) and have a numeric variable binned into the ranges [-Inf, b), [b, c), ..., [y, Inf], with the first and last cutpoints replaced by -Inf and Inf.
This seems to work fine with g = 4.
cuts<-cut2(v,g = 4,onlycuts = TRUE)
cuts[1]<- -Inf
cuts[length(cuts)]<- Inf
cuts
> cuts
[1] -Inf 4 7 10 Inf
table(cut2(v,cuts = cuts))
> table(cut2(v,cuts = cuts))
[-Inf, 4) [ 4, 7) [ 7, 10) [ 10, Inf]
        3       3        3          3
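But not with g = 7: the bin between the cutpoints 7 and 8 is reported as the single value 7 instead of the range [7, 8). How can I get a binning that follows a user-defined rule like this?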
cuts<-cut2(v,g = 7,onlycuts = TRUE)
cuts[1]<- -Inf
cuts[length(cuts)]<- Inf
cuts
> cuts
[1] -Inf 3 5 7 8 10 Inf
table(cut2(v,cuts = cuts))
> table(cut2(v,cuts = cuts))
[-Inf, 3) [ 3, 5) [ 5, 7) 7 [ 8, 10) [ 10, Inf]
        2       2       2 1        2          3
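One way to check whether the bins themselves differ, or only their labels, is to bin the same vector with base R's cut, which always labels a bin by its endpoints. The sketch below is a minimal illustration and not a definitive fix. It hard-codes the cutpoints printed above, and it assumes that cut2's oneval argument (documented in Hmisc as controlling whether a bin holding one unique value is labeled with that value) is what produces the bare 7 label.

```r
# Values and cutpoints copied from the g = 7 example above.
v <- 1:12
cuts <- c(-Inf, 3, 5, 7, 8, 10, Inf)

# Base R: right = FALSE gives left-closed bins [a, b);
# include.lowest = TRUE closes the last bin on the right, [10, Inf].
binned <- cut(v, breaks = cuts, right = FALSE, include.lowest = TRUE)
print(table(binned))
# Counts per bin: 2 2 2 1 2 3, with the fourth bin labeled [7,8)

# Assumption: Hmisc is installed. oneval = FALSE asks cut2 to keep
# interval labels even for bins that contain a single unique value.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(table(Hmisc::cut2(v, cuts = cuts, oneval = FALSE)))
}
```

The base R counts match the cut2 table above, which suggests the cutpoints are being respected and that only the label of the single-valued bin changes.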