I have a vector of temperature values:

temp <- c(2.6, 5.3, 4.6, 9.8, 9.4, 14.1, 16.2, 16.4, 11.6, 8.0, 3.0, 5.0)

I'm trying to create a factor from this vector with three levels: below 5 (l), between 5 and 15 (m), and above 15 (h). Any help is appreciated.

ThomasIsCoding
swed1983

3 Answers

I would suggest using the cut() function; you can then store the result in a data frame:

# Data
temp <- c(2.6, 5.3, 4.6, 9.8, 9.4, 14.1, 16.2, 16.4, 11.6, 8.0, 3.0, 5.0)
# Cut into left-closed intervals: [-Inf, 5), [5, 15), [15, Inf]
temp2 <- cut(temp, breaks = c(-Inf, 5, 15, Inf), labels = c('l', 'm', 'h'),
             include.lowest = TRUE, right = FALSE)
# Data frame
df <- data.frame(temp, temp2)

Output:

   temp temp2
1   2.6     l
2   5.3     m
3   4.6     l
4   9.8     m
5   9.4     m
6  14.1     m
7  16.2     h
8  16.4     h
9  11.6     m
10  8.0     m
11  3.0     l
12  5.0     m
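
How the boundary values 5 and 15 are classified depends on the right argument of cut(). A small sketch of the two choices, using made-up boundary inputs rather than the question's data (the names x, brk, lab, left_closed and right_closed are illustrative):

```r
x <- c(5, 15)                # illustrative boundary values only
brk <- c(-Inf, 5, 15, Inf)
lab <- c("l", "m", "h")

# right = FALSE: intervals are [a, b), so 5 falls in "m" and 15 in "h"
left_closed <- cut(x, breaks = brk, labels = lab, right = FALSE)

# right = TRUE (the default): intervals are (a, b], so 5 falls in "l" and 15 in "m"
right_closed <- cut(x, breaks = brk, labels = lab, right = TRUE)
```

Since the question does not say which level a value of exactly 5 or 15 belongs to, either setting may be the right one for the data at hand.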
Duck

A simple base R option

c("l","m","h")[(temp>=5) + (temp>=15)+1]

which gives

[1] "l" "m" "l" "m" "m" "m" "h" "h" "m" "m" "l" "m"
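
This works because logical values are coerced to 0 and 1 when added, so the sum counts how many thresholds each value has reached. The result is a character vector rather than a factor, though. A minimal sketch of wrapping it in factor(), assuming the level order l, m, h is wanted (idx and temp_f are illustrative names):

```r
temp <- c(2.6, 5.3, 4.6, 9.8, 9.4, 14.1, 16.2, 16.4, 11.6, 8.0, 3.0, 5.0)

# TRUE/FALSE coerce to 1/0, so idx is 1, 2 or 3 depending on thresholds reached
idx <- (temp >= 5) + (temp >= 15) + 1

# Give the levels explicitly so they are ordered l, m, h instead of alphabetically
temp_f <- factor(c("l", "m", "h")[idx], levels = c("l", "m", "h"))
```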
ThomasIsCoding

You cannot do exactly what you want, because a factor vector must contain only the values that are defined in its levels. In other words, the values in the vector cannot differ from the values defined in the levels. Below is another way to create a factor column for your case.

library(tidyverse)

temp <- c(2.6, 5.3, 4.6,9.8,9.4,14.1,16.2,16.4,11.6,8.0,3.0,5.0)

tab <- data.frame(
  temp = temp
)

tab <- tab %>% 
  mutate(
    case = case_when(
      temp < 5 ~ "Below 5",
      between(temp, 5, 15) ~ "Between 5 and 15",
      temp > 15 ~ "Above 15"
    )
  )

tab$case <- factor(tab$case)
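
By default factor() sorts the levels, so they will not come out in low-to-high order here. If that order matters, the levels can be passed explicitly. A sketch in base R on a small stand-in vector (case and case_f are illustrative names, not part of the code above):

```r
# Stand-in for tab$case, using the same three labels
case <- c("Below 5", "Between 5 and 15", "Above 15", "Below 5")

# Explicit levels fix the order from lowest to highest range
case_f <- factor(case, levels = c("Below 5", "Between 5 and 15", "Above 15"))
```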
Pedro Faria
  • Why do you say they can't do this exactly? Their vector isn't already a factor; they're trying to create one based on fixed intervals. That's what the base `cut` function does – camille Aug 31 '20 at 18:52
  • Hey camille! I understood that he wanted to transform his vector of doubles into a factor vector whose levels were the letters "l", "m", "h", while retaining the double values in the vector. In other words, a vector of doubles with a levels attribute containing the letters "l", "m", "h". I probably just got confused. But if that is what he wanted, R does not allow a factor vector where the values in the vector differ from the values defined in the levels attribute. – Pedro Faria Aug 31 '20 at 20:10