I have a data frame like this:

Day <- c("Day1","Day20","Day5","Day10")
A <- c(5,7,2,0)
B <- c(15,12,16,30)

df <- data.frame(Day,A,B)

df$Day <- as.character(df$Day)

The first column is a character vector, so I used the line below to sort the data frame. It does not give the right result: it reorders only the first column and leaves columns 2 and 3 unchanged.

df$Day <- df$Day[order(nchar(df$Day), df$Day)]

My desired output is:

  Day A  B
 Day1 5 15
 Day5 2 16
Day10 0 30
Day20 7 12

What am I missing here? Any input would be appreciated.

Sharath
  • Try this instead: `df <- df[order(nchar(df$Day)), ]`. Or simply `df <- df[order(df$Day), ]` if you don't want to sort by length. This does string sorting. If you want a different ordering, you are better off sorting by the numeric component of that column. – Gopala Feb 01 '16 at 00:27
  • Not right. Day20 comes before Day10. – Sharath Feb 01 '16 at 00:30
  • Like I said, you can't use strings to order and expect numeric ordering. – Gopala Feb 01 '16 at 00:30
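
As a side note, the ordering expression from the question appears to be fine on its own; the problem is that its result was assigned back to a single column instead of being used to index the rows. A minimal sketch, assuming every label shares the same `Day` prefix (so that sorting by length and then by string matches numeric order); the names `row_order` and `sorted_df` are only illustrative:

```r
df <- data.frame(
  Day = c("Day1", "Day20", "Day5", "Day10"),
  A = c(5, 7, 2, 0),
  B = c(15, 12, 16, 30),
  stringsAsFactors = FALSE
)

# Compute the row order once: shorter labels first, ties broken by string order.
row_order <- order(nchar(df$Day), df$Day)

# Apply the order to whole rows, so columns A and B move together with Day.
sorted_df <- df[row_order, ]
```

With mixed prefixes or zero-padded numbers this length-based trick no longer holds, so extracting the numeric part is likely the more robust route.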

3 Answers

You can sort on the numeric part of the day label, for example:

Day <- c("Day1","Day20","Day5","Day10")
A <- c(5,7,2,0)
B <- c(15,12,16,30)
df <- data.frame(Day,A,B, stringsAsFactors = FALSE)

df$DayNum <- as.numeric(gsub('Day', '', df$Day))
df <- df[order(df$DayNum), ]

Output as follows:

df
    Day A  B DayNum
1  Day1 5 15      1
3  Day5 2 16      5
4 Day10 0 30     10
2 Day20 7 12     20

You can avoid creating a new column by doing the following (the version above was meant to show each step in full):

df <- df[order(as.numeric(substr(df$Day, 4, nchar(df$Day)))), ]

The row order will be the same as above.
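
If the same sort is needed in several places, the idea can be wrapped in a small function. This is only a sketch: `sort_by_embedded_number` is a hypothetical helper name, and it assumes each value contains a single run of digits (values without digits become `NA` and end up last).

```r
# Hypothetical helper: order the rows of a data frame by the digits embedded
# in one of its character columns.
sort_by_embedded_number <- function(data, column) {
  digits <- gsub("[^0-9]", "", data[[column]])  # "Day10" -> "10"
  data[order(as.numeric(digits)), , drop = FALSE]
}

df <- data.frame(
  Day = c("Day1", "Day20", "Day5", "Day10"),
  A = c(5, 7, 2, 0),
  B = c(15, 12, 16, 30),
  stringsAsFactors = FALSE
)

sorted_df <- sort_by_embedded_number(df, "Day")
rownames(sorted_df) <- NULL  # optional: renumber the rows after sorting
```

Stripping every non-digit character, rather than using fixed positions as `substr` does, means the prefix can have any length.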

Gopala
  • Yes. Perfect. I was trying to work it out after you told me. Thanks for posting this. So elegantly done. I just applied it to a bigger dataset that I have and it's fast. – Sharath Feb 01 '16 at 00:38
  • I updated with a line that avoids adding a column. Hope that helps. – Gopala Feb 01 '16 at 00:40

This can be done with `mixedorder` from the gtools package:

 library(gtools)
 df[mixedorder(df$Day),]
 #    Day A  B
 #1  Day1 5 15
 #3  Day5 2 16
 #4 Day10 0 30
 #2 Day20 7 12
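
If adding a package is not an option, a similar ordering can be approximated in base R by splitting each label into its text prefix and its number and ordering on both. This is a rough sketch, not the algorithm gtools uses; it assumes every label is some text followed by one trailing run of digits, and the `Week` labels are made up for illustration.

```r
# Made-up labels with two different prefixes.
labels <- c("Week2", "Day10", "Day2", "Week10", "Day1")

prefix <- sub("[0-9]+$", "", labels)               # text before the trailing digits
number <- as.numeric(sub("^[^0-9]*", "", labels))  # the trailing digits as a number

# Order by prefix first, then numerically within each prefix.
sorted_labels <- labels[order(prefix, number)]
```

For the single-prefix data in the question this reduces to ordering on the number alone.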
akrun

Day <- c("Day1","Day20","Day5","Day10")
A <- c(5,7,2,0)
B <- c(15,12,16,30)
df <- data.frame(Day,A,B, stringsAsFactors = FALSE)

# add a leading zero to single-digit values of the Day column,
# e.g., "Day5" --> "Day05" (this pattern assumes at most two digits),
# then return the indices of the sorted vector
indices_to_sort_by <- sort(
    sub(
        pattern = "([a-z]{1})([1-9]{1}$)", 
        replacement = "\\10\\2", 
        x = df$Day
    ), 
    index.return = TRUE)$ix 

df[indices_to_sort_by, ]
#     Day A  B
# 1  Day1 5 15
# 3  Day5 2 16
# 4 Day10 0 30
# 2 Day20 7 12
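
The zero-padding idea can be extended to numbers of any length by padding to the width of the largest number instead of hard-coding one zero. A minimal sketch, assuming each label is a text prefix followed by digits; the three-digit value `Day100` is made up to show the effect:

```r
day <- c("Day1", "Day20", "Day5", "Day100")

prefix <- sub("[0-9]+$", "", day)               # "Day"
number <- as.integer(sub("^[^0-9]*", "", day))  # 1, 20, 5, 100

# Pad every number with zeros to the width of the longest one.
width <- max(nchar(as.character(number)))
padded <- paste0(prefix, formatC(number, width = width, flag = "0"))

# Plain string ordering of the padded labels now matches numeric order.
sorted_day <- day[order(padded)]
```

The padded strings are used only as a sort key, so the original labels stay unchanged.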
Jubbles