0

I am not that good at coding in R and I need help with a stats class project. I need to create a new categorical variable degOB in R that is derived from a variable POBAD in my data frame dd.

degOB = 0 for POBAD <= 30, 1 for 30 < POBAD <= 33, and 2 for 33 < POBAD. I must use 'as.factor' to do so, but I am not sure how to set this up.

Allan Cameron
  • `?cut` is the best option to use in this case – Jaap Apr 14 '20 at 20:23
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Apr 14 '20 at 20:27

3 Answers

0

The simple way to do this is with cut, as @Jaap mentioned. First we need to create some data that is similar to yours:

set.seed(42)
POBAD <- sample(25:40, 25, replace=TRUE)
dd <- data.frame(POBAD)

Now we add the new variable:

dd$degOB <- cut(dd$POBAD, breaks=c(0, 30, 33, max(dd$POBAD)))
levels(dd$degOB) <- 0:2
str(dd)
# 'data.frame': 25 obs. of  2 variables:
#  $ POBAD: int  25 29 25 33 34 28 26 34 25 40 ...
#  $ degOB: Factor w/ 3 levels "0","1","2": 1 1 1 2 3 1 1 3 1 3 ...

That is the easy way to do it. Using as.factor just makes it more complicated, but if you want to do that, use this statement instead of the one using cut.

dd$degOB <- as.factor(ifelse(dd$POBAD <= 30, 0, ifelse(dd$POBAD > 30 & dd$POBAD <= 33, 1, 2)))
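
As a hedged variation on the two statements above, `cut` can assign the codes directly through its `labels` argument, and open-ended breaks (`-Inf`, `Inf`) avoid depending on the observed range of the data. The values below are made up for illustration and are not from the asker's data; the sketch also checks that the `cut` and `ifelse` versions agree at the boundaries.

```r
# Made-up POBAD values covering both boundaries (30 and 33); not real data
dd <- data.frame(POBAD = c(25, 30, 31, 33, 34, 40))

# cut() uses right-closed intervals by default: (-Inf,30], (30,33], (33,Inf]
deg_cut <- cut(dd$POBAD, breaks = c(-Inf, 30, 33, Inf),
               labels = c("0", "1", "2"))

# Equivalent nested ifelse() wrapped in as.factor()
deg_ifelse <- as.factor(ifelse(dd$POBAD <= 30, 0,
                        ifelse(dd$POBAD <= 33, 1, 2)))

# Compare the two results as character codes
same_codes <- identical(as.character(deg_cut), as.character(deg_ifelse))
same_codes
```

Comparing as character avoids any difference in how the two factors store their levels internally.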
dcarlson
0

So that we can help, please post your code and what you have tried so far; otherwise it looks like we are doing your homework for you :)

Set up a call to factor() and feed it the vector, like this:

# Create the ordinal category codes (0, 1, 2) from POBAD
degree_vector <- ifelse(dd$POBAD <= 30, 0, ifelse(dd$POBAD <= 33, 1, 2))
# Convert `degree_vector` to a factor with ordered levels
factor_degree <- factor(degree_vector, ordered = TRUE, levels = c(0, 1, 2))
# Print the new variable
factor_degree
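
To illustrate what asking for an ordered factor adds, here is a minimal sketch with made-up category codes (not the asker's data). An ordered factor supports comparisons such as `<` and `>=` between levels, which an unordered factor does not.

```r
# Made-up degOB-style codes for illustration only
codes <- c(0, 2, 1, 0, 2)

# Ordered factor: levels are ranked 0 < 1 < 2
deg_ordered <- factor(codes, levels = c(0, 1, 2), ordered = TRUE)

# Comparisons against a level are meaningful for ordered factors
at_least_one <- deg_ordered >= "1"

# Count how many observations fall in category 1 or above
n_at_least_one <- sum(at_least_one)
n_at_least_one
```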

Option 2: Much simpler to understand, this is what I do

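# Step 1: set up your data frame (made-up POBAD values for illustration)
d <- data.frame(POBAD = c(28, 30, 31, 33, 35))

# Step 2: label each row with the range it falls into (labels are arbitrary)
d$range <- ifelse(d$POBAD <= 30, "low", ifelse(d$POBAD <= 33, "mid", "high"))

# Step 3: add a mapping from range label to degOB code, then make it a factor
mapping <- c("low" = 0, "mid" = 1, "high" = 2)
d$degOB <- as.factor(unname(mapping[d$range]))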
Transformer
0

This is what I tried:

dd$degOB = as.factor(ifelse(dd$POBAD <= 30, 0, ifelse(30 < dd$POBAD & dd$POBAD <= 33, 1, ifelse(dd$POBAD > 33, 2, NA))))

and I believe I got the correct answer. Thanks for your help!
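
One way to double-check a recoding like this is to look at the range of POBAD inside each category. The sketch below uses a small made-up vector (not the diabetes data) and assumes the same thresholds of 30 and 33; it also shows that a missing POBAD stays missing in degOB.

```r
# Made-up values that sit on and around the thresholds; not real data
dd <- data.frame(POBAD = c(29, 30, 30.5, 33, 33.5, NA))

dd$degOB <- as.factor(ifelse(dd$POBAD <= 30, 0,
                      ifelse(dd$POBAD <= 33, 1, 2)))

# Range of POBAD within each category; rows with NA in degOB are dropped
ranges <- tapply(dd$POBAD, dd$degOB, range)
ranges

# Counts per category, including NA
table(dd$degOB, useNA = "ifany")
```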

  • The database is already established with other variables and was obtained by `getcsv = function(x) read.csv(file=paste("http://www.umich.edu/~dnoll/BME503/",x,sep=""), header=T)`, then `dd = getcsv("diabetes_dat.csv")` and `attach(dd)` – Taya Monae Apr 14 '20 at 21:27