
I have read several similar questions that have already been answered, but a slight difference in my case still causes me problems.

I have a data frame column containing strings with five different levels: "10-20%" "100+%" "21-40%" "41-70%" "71-100%". I have tried both as.numeric and as.integer. Both convert the strings to numbers, but the numbers follow the alphabetical order of the levels: "10-20%" "100+%" "21-40%" "41-70%" "71-100%" are mapped to 1, 2, 3, 4, 5 respectively.

What I want is for the numbers to follow the numeric order: "10-20%" is 1, "21-40%" is 2, "41-70%" is 3, "71-100%" is 4 and "100+%" is 5. Do I have to reorder the levels manually to achieve this?

Appendix:

levels(dataset$PercentGrowth)
[1] ""        "10-20%"  "100+%"   "21-40%"  "41-70%"  "71-100%"

head(as.integer(dataset$PercentGrowth))
[1] 1 4 3 1 3 4

head(as.numeric(dataset$PercentGrowth))
[1] 1 4 3 1 3 4

head((dataset$PercentGrowth))
[1]        21-40% 100+%         100+%  21-40%
Levels:  10-20% 100+% 21-40% 41-70% 71-100%
derderstar

3 Answers

# df$string.var stands in for your column, e.g. dataset$PercentGrowth
as.numeric(factor(df$string.var,
    levels = c("10-20%", "21-40%", "41-70%", "71-100%", "100+%")))
?factor

Sample data would help.

Edited to add levels.
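
To make the call above concrete, here is a minimal sketch on made-up data. The data frame `dataset` and its values are assumptions modeled on the output shown in the question, including the empty-string level; the names `growth_levels` and `growth_code` are only illustrative.

```r
# Hypothetical data modeled on the question; stringsAsFactors = TRUE
# mimics a column that was read in as a factor.
dataset <- data.frame(
  PercentGrowth = c("", "21-40%", "100+%", "", "100+%", "21-40%", "10-20%"),
  stringsAsFactors = TRUE
)

# Levels listed in the intended numeric order.
growth_levels <- c("10-20%", "21-40%", "41-70%", "71-100%", "100+%")

# Re-level by label, then take the integer codes.
# Values not listed in `levels` (here the empty string) become NA.
growth_code <- as.integer(
  factor(as.character(dataset$PercentGrowth), levels = growth_levels)
)
growth_code
# [1] NA  2  5 NA  5  2  1
```

If the empty strings should get a code of their own instead of NA, one option is to add "" to `growth_levels` at the position where it should sort.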

r.bot

You should create a factor from your strings and specify the levels in the desired order:

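x <- c("10-20%", "100+%", "21-40%", "41-70%", "71-100%")
# levels listed in the intended numeric order, not the order they appear in x
lv <- c("10-20%", "21-40%", "41-70%", "71-100%", "100+%")
as.integer(factor(x, levels = lv))

[1] 1 5 2 3 4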
agstudy

You may try:

x <- c("10-20%", "100+%" ,"21-40%" ,"41-70%", "21-40%", "71-100%", "10-20%")
library(gtools)
match(x,unique(mixedsort(x)))
#[1] 1 5 2 3 2 4 1

##
as.numeric(factor(x, levels=unique(mixedsort(x))))
#[1] 1 5 2 3 2 4 1
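
If you would rather not depend on gtools, a similar result can be sketched in base R by sorting on the lower bound of each range. This is a swapped-in technique, not what mixedsort does internally, and it assumes every label starts with digits and that no two distinct labels share the same leading number. The names `lower`, `lv` and `codes` are illustrative.

```r
x <- c("10-20%", "100+%", "21-40%", "41-70%", "21-40%", "71-100%", "10-20%")

# Lower bound of each range: the leading run of digits ("100+%" -> 100).
lower <- as.numeric(sub("^([0-9]+).*$", "\\1", x))

# Distinct labels ordered by their lower bound.
# Assumption: labels start with digits and lower bounds are unique per label.
lv <- unique(x[order(lower)])

# Integer codes that follow the numeric order of the ranges.
codes <- as.integer(factor(x, levels = lv))
codes
# [1] 1 5 2 3 2 4 1
```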

Suppose your vector is: (Not a general solution)

x1 <- c("less than one year", "one year", "more than one year","one year", "less than one year")

gsub2() is taken from the question "R: replace characters using gsub, how to create a function?":

gsub2 <- function(pattern, replacement, x, ...) {
  # apply each pattern/replacement pair to x in turn
  for (i in seq_along(pattern)) {
    x <- gsub(pattern[i], replacement[i], x, ...)
  }
  x
}

x1[mixedorder(gsub2(c("less","^one","more"), c(0,1,2), x1))]
[1] "less than one year" "less than one year" "one year"          
[4] "one year"           "more than one year"
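
With only a handful of worded categories, listing the levels by hand may be simpler than rewriting the strings. A minimal sketch, where `duration_levels` and `x1_fac` are illustrative names and the order of the levels is my assumption about what is intended:

```r
x1 <- c("less than one year", "one year", "more than one year",
        "one year", "less than one year")

# Levels written out in the intended order (assumed).
duration_levels <- c("less than one year", "one year", "more than one year")

# An ordered factor sorts and compares by level position, not alphabetically.
x1_fac <- factor(x1, levels = duration_levels, ordered = TRUE)

as.integer(x1_fac)
# [1] 1 2 3 2 1

as.character(sort(x1_fac))
```

Sorting `x1_fac` should give the same order as the mixedorder() call above, without any string substitution.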
akrun
  • Your code helped me a lot, especially the function mixedsort, which can order strings with embedded numbers. By the way, do you know of any other functions that can order strings such as "less than one year", "one year", "more than one year"? – derderstar Jun 20 '14 at 03:14
  • Hi "user3757135", I am not aware of any function that does this kind of string sort. One way would be to manually specify the levels in ?factor() if you don't have that many unique strings. Another way would be to change the alphabetic string to a mixed string and use ?mixedorder. Please check the edit. – akrun Jun 20 '14 at 09:23
  • Hi akrun, thank you for your comments. They helped me! For the latter part, I think I will do it manually. Thank you =) – derderstar Jun 20 '14 at 16:22