
I want to create several factor columns in my data frame based on this column (the real file is large, unordered, and its values have more decimals):

> df

0.05
0.1
0.15
0.20
0.25
0.30
.
.
.
0.90
0.95
0.99

The values range from 0.05 to 0.99, and I want to make factor columns with bins of width 0.1 and 0.05, and possibly other widths as well. I tried using the ifelse function like this:

df$bin1 <- ifelse(df$V1 < 0.1, 1 , ifelse(0.1 <= df$V1 & df$V1 < 0.2,2,ifelse(...))

It worked, but the command was long and would be very cumbersome to repeat for the other bins that I want to use.

Javier2013
  • The answer to [this](http://stackoverflow.com/questions/27902821/create-column-with-grouped-values-based-on-another-column-in-dplyr/27902907?noredirect=1#comment44206402_27902907) question should be helpful in learning how to use the `cut` function to group the data. – Andrew Taylor Jan 23 '15 at 12:35
  • Or: http://stackoverflow.com/questions/4126326/how-to-quickly-form-groups-quartiles-deciles-etc-by-ordering-columns-in-a/4126475#4126475, http://stackoverflow.com/questions/11950923/refactor-data-frame-column-values/11951058#11951058 – Henrik Jan 23 '15 at 12:35

1 Answer

One way to do this is with the hist() function, like this:

hist((1:10), breaks=c(0, 2.5, 5.5, 12))
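As a rough illustration of what that call produces, the sketch below stores the result instead of plotting it. The variable name `h` is only illustrative; `counts` and `breaks` are components of the object that `hist()` returns.

```r
# Same call as above, but without drawing the plot.
h <- hist(1:10, breaks = c(0, 2.5, 5.5, 12), plot = FALSE)

# One count per bin, not one value per input element:
# there are 3 bins here but 10 input values.
h$counts
h$breaks
length(h$counts)
```

The result summarises each bin, so on its own it does not say which bin a given row belongs to.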

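You can set your own breakpoints and it bins the results. Note that hist() returns a "histogram" object with per-bin counts, not a bin label for each row, so its result cannot be assigned directly to a data frame column. To get a per-row bin without plotting, cut() with the same breaks should work (right = FALSE makes the intervals closed on the left, matching the `0.1 <= V1 & V1 < 0.2` logic in the question):

df$bin <- cut(df$V1, breaks=c(0, 0.05, 0.1, 0.2, 1), right = FALSE)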
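For the regular 0.1 and 0.05 bins mentioned in the question, a minimal sketch along these lines may be less cumbersome than nested `ifelse()` calls. It assumes the column is named `V1` as in the question; the sample values, the helper name `bin_by_width`, and the output column names are made up for illustration.

```r
# Hypothetical sample data; V1 mirrors the column name used in the question.
df <- data.frame(V1 = c(0.07, 0.12, 0.26, 0.58, 0.93, 0.99))

# Hypothetical helper: bin x into equal-width, left-closed intervals on [0, 1].
bin_by_width <- function(x, width) {
  breaks <- seq(0, 1, by = width)
  # right = FALSE gives [a, b) intervals; include.lowest = TRUE closes the
  # last interval on the right so a value of exactly 1 is not dropped.
  cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
}

df$bin_w10 <- bin_by_width(df$V1, 0.1)   # factor with 10 levels
df$bin_w05 <- bin_by_width(df$V1, 0.05)  # factor with 20 levels

# Integer codes, comparable to the 1, 2, 3, ... from the nested ifelse().
df$bin1 <- as.integer(df$bin_w10)
```

Other widths only need a different `width` argument. One caveat: values that sit exactly on a break (for example 0.3) may land on either side of it because of floating-point rounding in the generated breaks, so it is worth checking a few boundary cases on the real data.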
LauriK