I have a simple question:

value
1000
2500
5080
10009

I want to assign each value to an interval:

value    Range
1000     0-1000
2500     1001-5000
5080     5001-10000
10009    10001-20000

I tried this:

dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value < 5001, "1001-5000", ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))

However, I got Error: unexpected '<' in "dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value <"

Any help?

EDIT:

This question is not asking for the best way to convert a continuous variable to a factor. It is asking for debugging help with the reproducible example:

library(data.table)
dt <- data.table(value = c(1000, 2500, 5080, 10009))
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value < 5001, "1001-5000", ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))
# produces the error above
De Novo
Peter Chen
  • See `help("cut")` for a better solution than nested `ifelse`. – Roland Apr 03 '18 at 07:48
  • And because you are using data.table: `ifelse` is slow. – Roland Apr 03 '18 at 07:50
  • Possible use of `cut`: https://stackoverflow.com/questions/13559076/convert-continuous-numeric-values-to-discrete-categories-defined-by-intervals – Ronak Shah Apr 03 '18 at 07:51
  • Isn't your problem the line `ifelse(1000 < value < 5001, ...` as noted in the answer below? R does not take two-way inequalities. You need to break it down. – Sotos Apr 03 '18 at 08:50
  • To those voting to close due to a typo: it's not a typo. It's a non-obvious syntax error that programmers coming from other languages are likely to make. I'd say that makes it a useful question (and answer) for other users. – De Novo Apr 03 '18 at 09:39

1 Answer

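Like many errors, this one means what it says: the parser did not expect the second `<`. Unlike Python, R cannot chain comparisons, so `1000 < value < 5001` is a syntax error. Instead, combine two comparisons with `&`, as in `1000 < value & value < 5001`: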

library(data.table)
dt <- data.table(value = c(1000, 2500, 5080, 10009))
dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value & value < 5001, "1001-5000", ifelse(5000 < value & value < 10001, "5001-10000", "10001-20000")))]
dt
   value       Range
1:  1000      0-1000
2:  2500   1001-5000
3:  5080  5001-10000
4: 10009 10001-20000
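
Since the table is already a data.table, the two-sided test can also be written with data.table's `between()` function or its operator form `%between%`. The following is only a minimal sketch of that alternative, not part of the original answer; the column name `in_second_range` is made up for illustration.

```r
library(data.table)

dt <- data.table(value = c(1000, 2500, 5080, 10009))

# between() includes both bounds by default,
# so this tests 1001 <= value & value <= 5000
dt[, in_second_range := between(value, 1001, 5000)]

# %between% is the operator form of the same check;
# here it keeps only the rows that fall inside the range
dt[value %between% c(1001, 5000)]
```

Note that the bounds are inclusive here, whereas the `ifelse` version above uses strict inequalities, so the endpoints are shifted accordingly.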

As @akrun mentioned, you may be better off with a factor. Here's an example:

dt[, Range := cut(value, breaks = c(0, 1001, 5001, 10001, 20001), labels = c("0-1000", "1001-5000", "5001-10000", "10001-20000"), right = FALSE)]

This produces a data.table that displays the same way, but the Range column is now a factor whose levels are the range labels. Setting right = FALSE makes each interval closed on the left, so a value of 1001 lands in 1001-5000, matching the ifelse version.
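
If a plain character column is preferred over a factor, `findInterval` (also suggested in the comments) can be used to index into the labels. This is a rough sketch under the assumption that every value lies in [0, 20001); values outside that range would need separate handling. The helper names `breaks` and `labels` are introduced here for illustration only.

```r
library(data.table)

dt <- data.table(value = c(1000, 2500, 5080, 10009))

# Lower edge of each interval, plus the upper limit of the last one
breaks <- c(0, 1001, 5001, 10001, 20001)
labels <- c("0-1000", "1001-5000", "5001-10000", "10001-20000")

# findInterval() returns i such that breaks[i] <= value < breaks[i + 1],
# which is then used to look up the matching label.
# Assumes 0 <= value < 20001 for every row.
dt[, Range := labels[findInterval(value, breaks)]]
dt
```

Like `cut`, this is vectorised over the whole column and avoids the nested `ifelse` calls.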

De Novo
  • Thanks, you're right. I have been learning Python recently and confused it with R, which I already know. Much appreciated. – Peter Chen Apr 03 '18 at 07:42
  • Instead of multiple `ifelse`, you can use `cut` or `findInterval`. – akrun Apr 03 '18 at 07:49
  • `cut` is great! Maybe you can add a Python way to do this. I think some programmers may run into this kind of issue when they switch between languages. – Peter Chen Apr 09 '18 at 02:06