I have a data frame `k` that contains the following strings, in this order:

6 to 12 months    
12 to 24 months    
36 to 60 months    
60 to 96 months    
0 to 6 months      
24 to 36 months    
96 to 120 months   
120 months & above.

When I apply the sort command, it compares the values as text rather than as numbers. For example, the string "120 months & above" is placed before "6 to 12 months". Can anyone tell me how to sort it like below using an R command?

0 to 6 months      
6 to 12 months
12 to 24 months    
24 to 36 months
36 to 60 months
60 to 96 months
96 to 120 months   
120 months & above.
Andrew Gustar
  • `order <- c(5,1,2,6,3,4,7,8)` then reorder using `x2 <- x[order]` – Andrew Gustar Jul 07 '17 at 11:14
  • Are you asking how to sort this particular list, or any list with similar values? Passing a custom sorting (ordering) function is not obvious in R. – anotherfred Jul 07 '17 at 11:18
  • This post does something similar https://stat.ethz.ch/pipermail/r-help/2011-June/280285.html the key point is that you will probably do a string split to get the first number then convert it to numeric so it can be sorted as a number. – anotherfred Jul 07 '17 at 11:21
  • @anotherfred I am asking about sorting any list with similar values. – Praveen Singh Jul 07 '17 at 11:21
  • `gtools` has a very nice function for that. Try `gtools::mixedsort(df$string)` – Sotos Jul 07 '17 at 11:40
  • @Sotos thank you for letting me know about that function! It would be a simpler choice for OP, although it sounds like they wouldn't benefit from such a 'black box' – anotherfred Jul 07 '17 at 12:29
  • @anotherfred I disagree. If the OP's goal is to make that ordering fast and effortless (which is usually the case), then I 'd say that this function is spot on! – Sotos Jul 07 '17 at 12:32
  • @Sotos indeed, but looking at their comments, their R knowledge, at least, is benefitting – anotherfred Jul 07 '17 at 16:08
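
The string-split idea from the comments above can be sketched in base R as follows. This is only a minimal illustration, assuming every string starts with a whole number followed by a space; the vector name `x` and the helper `leading_number` are made up for this example.

```r
# Example input, in the same order as in the question
x <- c("6 to 12 months", "12 to 24 months", "36 to 60 months",
       "60 to 96 months", "0 to 6 months", "24 to 36 months",
       "96 to 120 months", "120 months & above")

# Hypothetical helper: keep the text before the first space
# and convert it to a number
leading_number <- function(s) as.numeric(sub(" .*$", "", s))

# order() returns the positions that put the numbers in ascending order
sorted <- x[order(leading_number(x))]
print(sorted)
```

Because the comparison is done on numbers, "120 months & above" ends up last instead of before "6 to 12 months".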

2 Answers

When the default sort does not work, you need to specify the order manually. Create a factor whose levels are in the order you want, and sort it. This works even if the vector does not contain every group listed in `order`.

col1 = c("6 to 12 months", 
        "12 to 24 months", 
        "36 to 60 months", 
        "60 to 96 months", 
        "0 to 6 months", 
        "24 to 36 months", 
        "96 to 120 months", 
        "120 months & above")

order <- c("0 to 6 months", 
         "6 to 12 months", 
         "12 to 24 months", 
         "24 to 36 months", 
         "36 to 60 months", 
         "60 to 96 months",
         "96 to 120 months",
         "120 months & above")
# Convert to a factor whose levels follow the desired order, then sort by level
col2 <- factor(col1, levels = order)
sort(col2)

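Just make sure that `order` contains all the possible values in your vector: any value that is not among the levels becomes `NA` and is dropped by `sort()`. This example looks trivial because every value in the input vector is unique. The approach makes more sense when the input is, say, of length 100 with these 8 possible values repeated.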
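
To illustrate the point about longer inputs, here is a small sketch with repeated values and one value that is deliberately absent from the levels. The vector `responses` and the label "unknown" are invented for this example.

```r
lvls <- c("0 to 6 months", "6 to 12 months", "12 to 24 months")

# Repeated values plus one label ("unknown") that is not among the levels
responses <- c("12 to 24 months", "0 to 6 months", "unknown",
               "6 to 12 months", "0 to 6 months")

f <- factor(responses, levels = lvls)

# Values not listed in `levels` are turned into NA
print(sum(is.na(f)))

# sort() orders by level and drops NA by default, so the result is shorter
sorted <- sort(f)
print(as.character(sorted))
```

Checking `sum(is.na(f))` before sorting is a cheap way to notice a label that was left out of the levels.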

When there are many more such groups, extract the leading number from each string and sort by it:

names(sort(sapply(col1, function(x)
  as.integer(stringr::str_split(x, pattern = ' ')[[1]][1]))))

Both methods give the same ordering (the first returns a factor, the second a character vector). I would prefer the first method because it is much less prone to mistakes.

What about this:

library(stringr)

k <- c("6 to 12 months",    
       "12 to 24 months",    
       "36 to 60 months",   
       "60 to 96 months",  
       "0 to 6 months", 
       "24 to 36 months",
       "96 to 120 months",
       "120 months & above")

index <- sapply(seq_along(k), function(x) stringr::str_split(k, pattern = "\\s")[[x]][1])

df <- data.frame(k, index = as.numeric(index))

# Order by the numeric column; the standalone `index` is still character
df[order(df$index), ]

I think you can scale this to any number of categories (as long as each string starts with the number you want to sort by).
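
One possible way to combine this with the factor approach from the other answer is to derive the factor levels from the leading number, so that the order does not have to be typed by hand. This is a sketch in base R rather than `stringr`, again assuming each string begins with a number followed by a space; `first_num`, `lvls` and `k_factor` are just illustrative names.

```r
k <- c("6 to 12 months", "12 to 24 months", "36 to 60 months",
       "60 to 96 months", "0 to 6 months", "24 to 36 months",
       "96 to 120 months", "120 months & above")

# Numeric value of the text before the first space
first_num <- as.numeric(sub(" .*$", "", k))

# Unique labels arranged by their leading number become the factor levels
lvls <- unique(k[order(first_num)])
k_factor <- factor(k, levels = lvls)

# Ordering a data frame by the factor now follows the numeric order
df <- data.frame(k = k_factor)
df_sorted <- df[order(df$k), , drop = FALSE]
print(levels(k_factor))
```

Once the levels are set this way, functions such as `table()` also list the groups in level order.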

JdeMello
  • Can you please explain this function sapply(seq_along(k), function(x) stringr::str_split(k, pattern = "\\s")[[x]][1]) and what it does by passing these arguments? – Praveen Singh Jul 07 '17 at 12:15
  • To find out more about any function, enter a question mark followed by the function name at the R console, e.g. `?sapply`. Here is an excellent guide to the apply functions to get you started: https://stackoverflow.com/a/7141669/5316882 . The `stringr::str_split` part splits up a character string. – anotherfred Jul 07 '17 at 12:21
  • Hi Praveen, `stringr` is a package from the `tidyverse`, a set of packages that makes your life much easier in many areas (including data wrangling!). `sapply` is part of the apply family in base R. It lets you apply a function over a vector or list. `lapply` returns a list, whereas `sapply` simplifies the result to a vector where it can. The function `stringr::str_split()` splits each element of the vector `k` into parts according to some pattern. In this case, with `pattern = "\\s"`, I am splitting each element of `k` at every whitespace character (`\\s`). `[[x]]` then picks the pieces of the x-th string, and `[1]` takes the first of those pieces. – JdeMello Jul 07 '17 at 12:23
  • I would also follow @anotherfred 's advice. – JdeMello Jul 07 '17 at 12:26
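
To make the explanation in these comments concrete, the intermediate results can be inspected one step at a time. The sketch below uses base R's `strsplit()` as a stand-in for `stringr::str_split()` so that it runs without extra packages; for a simple whitespace pattern like this one the two are expected to give the same pieces. The names `parts` and `first_piece` are only for illustration.

```r
k <- c("6 to 12 months", "120 months & above")

# Step 1: split every string at whitespace; the result is a list
# with one character vector per input string
parts <- strsplit(k, "\\s")
print(parts[[1]])   # pieces of the first string
print(parts[[2]])   # pieces of the second string

# Step 2: [[x]] picks the pieces of the x-th string, [1] the first piece
first_piece <- sapply(seq_along(k), function(x) parts[[x]][1])
print(first_piece)

# Step 3: the pieces are still text, so convert them before sorting
print(as.numeric(first_piece))
```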