
I am very new to R and am having some difficulty dividing the numbers in a vector into categories called A (<15), B (15-30), C (30-45) and D (>45), so that I can eventually run a multivariate regression model.

I'm currently using `if` statements (if there is a better way to do this, I would be fine with that too). Here's the code:

high<- inc_edu_waste$Percentage.high

cathigh<- rep(0, times=408)
for (i in 1:408){
if (high[i] < 15){high[i] <- "A"}
if (high[i]>=15 & high[i]<30){cathigh[i] <- "B"}
if (high[i]>=30 & high[i]<45){cathigh[i] <- "C"}
if (high[i]>=45 & high[i]<100){cathigh[i] <- "D"}
}

When I run this I get the following errors:

Error in if (high[i] < 15) { : missing value where TRUE/FALSE needed

In addition: Warning message: In Ops.factor(high[i], 15) : < not meaningful for factors.

Your help would be very much appreciated!

RvHt
  • Consider using `cut`. Something like `LETTERS[1:5][cut(high, breaks=c(-Inf, 15, 30, 45, 100, Inf), labels=FALSE)]` – akrun Oct 18 '14 at 15:40
  • Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Rich Scriven Oct 18 '14 at 15:47

1 Answer


As akrun suggested, you can use `cut`:

> v <- 1:99
> cut(v, c(0,14,29,44,99), LETTERS[1:4])
 [1] A A A A A A A A A A A A A A B B B B B B B B B B B B B B B C C C C C C C C C C C C C C C D D D D
[49] D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D D
[97] D D D
Levels: A B C D
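The warning in the question indicates that `high` is stored as a factor rather than as numbers. Comparing a factor with `<` returns `NA`, and `if (NA)` then fails with "missing value where TRUE/FALSE needed". A minimal sketch of one way to handle that before calling `cut`, assuming the column holds numbers stored as text (the sample values below, including the `"n/a"` entry, are made up for illustration and are not from the original data):

```r
# Hypothetical stand-in for inc_edu_waste$Percentage.high, stored as a factor
high <- factor(c("12.5", "15", "29.9", "30", "45", "60", "n/a"))

# Convert through character first: as.numeric() applied directly to a factor
# returns the internal level codes, not the values. Non-numeric entries
# become NA (the coercion warning is suppressed here on purpose).
high_num <- suppressWarnings(as.numeric(as.character(high)))

# right = FALSE makes each interval closed on the left, which matches the
# conditions in the question: [-Inf, 15), [15, 30), [30, 45), [45, Inf)
cathigh <- cut(high_num,
               breaks = c(-Inf, 15, 30, 45, Inf),
               labels = c("A", "B", "C", "D"),
               right = FALSE)

print(cathigh)
```

With these sample values, 15 falls in B, 30 in C, 45 in D, and the non-numeric entry stays `NA`. Using `-Inf` and `Inf` as the outer breaks also handles non-integer percentages such as 14.5, which integer breaks like `c(0, 14, 29, 44, 99)` would place in B.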
DatamineR