
I am new to R, and I'd like help in finding a better way to write the following code I've written. Any help would be appreciated.

df$rank[between(df$score,0,1.2)] <- 1
df$rank[between(df$score,1.2,2.1)] <- 2
df$rank[between(df$score,2.1,2.9)] <- 3
df$rank[between(df$score,2.9,3.7)] <- 4
df$rank[between(df$score,3.7,4.5)] <- 5
df$rank[between(df$score,4.5,5.4)] <- 6
Lonewolf
  • Try `if_else` or `case_when` from `dplyr` package – Tung Apr 29 '18 at 18:37
  • If you're partitioning df$score you could try `df$rank <- as.numeric(cut(df$score, breaks = c(0, 1.2, ...), include.lowest = TRUE))` – Russ Hyde Apr 29 '18 at 18:46
  • [Convert continuous numeric values to discrete categories defined by intervals](https://stackoverflow.com/questions/13559076/convert-continuous-numeric-values-to-discrete-categories-defined-by-intervals) – Henrik Apr 29 '18 at 18:54
  • Possible duplicate of [Convert continuous numeric values to discrete categories defined by intervals](https://stackoverflow.com/questions/13559076/convert-continuous-numeric-values-to-discrete-categories-defined-by-intervals) – denis Apr 29 '18 at 19:24
  • Please include your library call; `between` can come from either `dplyr` or `data.table` – moodymudskipper Apr 29 '18 at 19:38

3 Answers


You can use `cut`:

df$rank <- cut(df$score, breaks = c(0, 1.2, 2.1, 2.9, 3.7, 4.5, 5.4, Inf), labels = FALSE)
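
A note on the boundaries, offered as a sketch rather than as part of the original answer: `cut` builds right-closed intervals by default, so a score of exactly 0 falls in no bin, and `labels = FALSE` makes it return integer bin codes instead of a factor. The `score` vector below is made up for illustration, and the `Inf` break is left out so that values above 5.4 stay `NA`, as they do in the question's code.

```r
# Hypothetical scores covering the edge cases: 0, a shared boundary (1.2),
# the top boundary (5.4) and a value above the last break
score  <- c(0, 0.5, 1.2, 2.5, 5.4, 5.9)
breaks <- c(0, 1.2, 2.1, 2.9, 3.7, 4.5, 5.4)

# Default intervals are (lo, hi]: 0 is excluded from the first bin
rank_default <- cut(score, breaks = breaks, labels = FALSE)

# include.lowest = TRUE closes the first interval on the left: [0, 1.2]
rank_closed <- cut(score, breaks = breaks, labels = FALSE, include.lowest = TRUE)

print(rank_default)
print(rank_closed)
```

If `Inf` is appended to the breaks, scores above 5.4 get rank 7 instead of `NA`; which behaviour is wanted depends on the data.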
moodymudskipper
library(dplyr)

set.seed(1234)
df <- data.frame(rank  = rep(0, 15),
                 score = runif(15, 0, 6))
df

#>    rank      score
#> 1     0 0.68222047
#> 2     0 3.73379643
#> 3     0 3.65564840
#> 4     0 3.74027665
#> 5     0 5.16549230
#> 6     0 3.84186363
#> 7     0 0.05697454
#> 8     0 1.39530304
#> 9     0 3.99650255
#> 10    0 3.08550685
#> 11    0 4.16154775
#> 12    0 3.26984901
#> 13    0 1.69640150
#> 14    0 5.54060091
#> 15    0 1.75389504

df %>% 
  mutate(rank = case_when(between(score,   0, 1.2) ~ 1,
                          between(score, 1.2, 2.1) ~ 2,
                          between(score, 2.1, 2.9) ~ 3,
                          between(score, 2.9, 3.7) ~ 4,
                          between(score, 3.7, 4.5) ~ 5,
                          between(score, 4.5, 5.4) ~ 6))
#>    rank      score
#> 1     1 0.68222047
#> 2     5 3.73379643
#> 3     4 3.65564840
#> 4     5 3.74027665
#> 5     6 5.16549230
#> 6     5 3.84186363
#> 7     1 0.05697454
#> 8     2 1.39530304
#> 9     5 3.99650255
#> 10    4 3.08550685
#> 11    5 4.16154775
#> 12    4 3.26984901
#> 13    2 1.69640150
#> 14   NA 5.54060091
#> 15    2 1.75389504

Created on 2018-04-29 by the reprex package (v0.2.0).

Tung

As you didn't add a reproducible example, I created a little one (but keep in mind you should always add an example).

Using `ifelse` from base R, you could do it this way:

# Note: the "rank" column holds the raw scores here; data.frame() keeps the example in base R
df = data.frame(rank = c(1.2, 3.3, 2.5, 3.7, 5.8, 6, 3, 1.1, 0.5))
df$rank2 = ifelse(df$rank>0 & df$rank<=1.2, 1, 
             ifelse(df$rank>1.2 & df$rank<=2.1, 2, 
                    ifelse(df$rank>2.1 & df$rank<=2.9, 3, 
                           ifelse(df$rank>2.9 & df$rank<=3.7, 4, 
                                  ifelse(df$rank>3.7 & df$rank<=4.5, 5, 6)))))

The innermost `ifelse` covers your maximum rank: its "no" argument is the value assigned to the last range.
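
One consequence worth checking: because the final "no" value is a catch-all, any score outside the earlier ranges (such as 5.8 or 6 in the example data) also receives rank 6. A possible variant, assuming out-of-range scores should stay `NA` as in the question's code, adds one more explicit test. The `score` vector here is hypothetical.

```r
# Hypothetical scores, including two values outside the 0 to 5.4 range
score <- c(0.5, 1.5, 2.5, 3.3, 4.0, 5.0, 5.8, -1)

# Every range is tested explicitly; only unmatched scores reach the final NA
rank <- ifelse(score > 0   & score <= 1.2, 1,
        ifelse(score > 1.2 & score <= 2.1, 2,
        ifelse(score > 2.1 & score <= 2.9, 3,
        ifelse(score > 2.9 & score <= 3.7, 4,
        ifelse(score > 3.7 & score <= 4.5, 5,
        ifelse(score > 4.5 & score <= 5.4, 6, NA))))))

print(rank)
```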

If this is a recurring problem, you should create a function.
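
As a rough illustration of what such a function might look like, here is a minimal sketch. The name `score_to_rank` and its default breaks are assumptions made for this example, and it relies on `cut` rather than nested `ifelse` calls to keep the body short.

```r
# Hypothetical helper: maps scores to integer ranks using the given break points.
# Scores outside the breaks come back as NA.
score_to_rank <- function(score, breaks = c(0, 1.2, 2.1, 2.9, 3.7, 4.5, 5.4)) {
  cut(score, breaks = breaks, labels = FALSE, include.lowest = TRUE)
}

# Usage on a small made-up data frame
df <- data.frame(score = c(0.5, 1.2, 3.3, 5.8))
df$rank <- score_to_rank(df$score)
print(df)
```

Passing a different `breaks` vector reuses the same logic for another scale without rewriting the conditions.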

Hope it helps.

Giovana Stein
  • thanks, would you be able to recommend a good website that introduces functions? I tried googling it, but most websites do not do a good job in explaining it. – Lonewolf Apr 30 '18 at 00:05
  • You could use DataCamp; they have plenty of courses that teach almost everything in R. – Giovana Stein May 01 '18 at 12:55