Is there an R function to sort column variables?

Question

Is there an R function to sort column variables? I have columns like the ones below:

Col1    11  110   1100    12
   a    1   2     20      22
   b    16  5      3      18

By default, R sorts the column names as 11, 110, 1100, and so on.

But I need:

Col1    11  12    110   1100    
   a    1   22     2    20  
   b    16  18     5    3

Is there a way to do this?

You shouldn't use integers as column names, because they aren't really integers but character strings. Hence the usual sort gives undesired results, and using such column names for data manipulation is hard and confusing. – David Arenburg, May 19 '19 at 07:42
To add to @DavidArenburg's comment, R will almost always add an X to numeric column names. The trouble is that even if you removed this X, the presence of `Col1` makes it hard to sort these names. `dplyr`'s `select` in combination with `everything` might make it easier. – NelsonGon, May 19 '19 at 07:44
@David, the issue is that there are lots of columns with integer names. It is not possible to change them to characters. Is there an alternative way to deal with this? Can't we sort the column names? – Dev, May 19 '19 at 07:45
You haven't understood me. Column names are never integers, even if they are printed as such. They are sorted as characters, so "110" comes before "12", and your sort won't make much sense. You could use a helper function to achieve this, such as `df[c("Col1", gtools::mixedsort(names(df)[-1]))]`. Still, it's not good practice to have such column names. – David Arenburg, May 19 '19 at 07:56

NelsonGon · Answer 1 · 2019-05-19T15:16:15.840

1

If you only have Col1 as non-numeric, you could use:

df[, c("Col1", as.character(sort(as.numeric(names(df)[-1]), decreasing = FALSE)))]
  Col1 11 12 110 1100
1    a  1 22   2   20
2    b 16 18   5    3
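To see why the conversion to numeric matters, here is a minimal sketch using only the four column names from the question. The variable names `nms`, `by_text` and `by_value` are made up for illustration, and `method = "radix"` is used only so that the text ordering does not depend on the locale.

```r
# Column names are character strings, so sorting them compares text, not values
nms <- c("11", "110", "1100", "12")

# Text ordering: "110" and "1100" sort before "12"
by_text <- sort(nms, method = "radix")

# Numeric ordering: convert to numbers only to compute the order,
# then index the original character names with it
by_value <- nms[order(as.numeric(nms))]

print(by_text)
print(by_value)
```

Indexing with `order()` keeps the names as characters, so the result can be passed straight to `df[, ...]`.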

Otherwise:

To add to @DavidArenburg's comment, R will almost always add an X to numeric column names. The trouble is that even if you removed this X, the presence of Col1 makes it hard to sort these names. dplyr's select in combination with everything might make it easier, as shown below.

df <- read.table(text="Col1    11  110   1100    12
   a    1   2     20      22
   b    16  5      3      18", header = TRUE)

# Remove only the leading "X" that read.table prepends to numeric names
names(df) <- sub("^X", "", names(df))

As @akrun points out, we can skip the gsub by setting check.names = FALSE in read.table, i.e.:

df <- read.table(text="Col1    11  110   1100    12
   a    1   2     20      22
   b    16  5      3      18", header = TRUE, check.names = FALSE)

Proceeding with dplyr:

library(dplyr)
df %>%
  select(Col1, `11`, `12`, everything())
  Col1 11 12 110 1100
1    a  1 22   2   20
2    b 16 18   5    3

edited May 19 '19 at 15:16

answered May 19 '19 at 07:45

NelsonGon


Hi Nelson. I got it, thanks. Here it is only 11 and 12, but I gave just a sample of the columns. In my dataset, there are many columns like 11, 110, 1100, ... 12, 120, 1200, ... 13, 130, 1300 and so on. – Dev May 19 '19 at 07:59
Please try the first part of this answer. – NelsonGon May 19 '19 at 08:00
I actually tried this. I am getting 11, 12, 13 and so on, but after 19 it's 110 instead of 20. I need 20, 21 and so on. – Dev May 19 '19 at 08:06
Could you add your data to the question with `dput(head(df))`? – NelsonGon May 19 '19 at 08:19
I can add it, but there are 180 columns. – Dev May 19 '19 at 08:28
How did you read this data into R and how did the column names become numeric? I cannot reproduce the issue of 110 coming before 20 for instance. Without a sample of your data it may be impossible to further solve the problem. – NelsonGon May 19 '19 at 08:37
I asked another question at https://stackoverflow.com/questions/56204808/is-there-an-r-function-to-split-the-sentence/56204989#56204989 There, the code `col = paste0("Col", 1:n()))` is actually giving me Col1, Col11 and so on. Hope this helps. – Dev May 19 '19 at 08:48
I don't get it because then this changes the aim of your question. In this question you state that your data has numeric columns yet in that link that is not the case. – NelsonGon May 19 '19 at 09:08
Thanks @akrun I will keep this in mind going forward. – NelsonGon May 19 '19 at 15:14

score 1 · Answer 2 · answered May 19 '19 at 07:58

A workaround with base R could be:

df <- read.table(text = "Col1    11  110   1100    12
a    1   2     20      22
b    16  5      3      18", header = TRUE)

# This step is not necessary if your column names do not contain X's
colnames(df)[-1] <- gsub("\\D", "", colnames(df)[-1])

df[, c(colnames(df)[1], as.character(sort(as.numeric(colnames(df)[-1]))))]

  Col1 11 12 110 1100
1    a  1 22   2   20
2    b 16 18   5    3

Still, I'd recommend taking the concerns of @DavidArenburg and @NelsonGon into account.
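Both answers assume that Col1 is the first and only non-numeric column. A rough base R sketch that drops that assumption is given here: `sort_numeric_cols` is a hypothetical helper, not something from either answer, and it assumes the data was read with check.names = FALSE so that the names contain only digits.

```r
# Hypothetical helper (not from the answers above): columns whose names are
# not purely digits are moved to the front in their original order, followed
# by the digit-named columns in ascending numeric order.
sort_numeric_cols <- function(df) {
  nms <- names(df)
  is_num <- grepl("^[0-9]+$", nms)                  # names made of digits only
  num_sorted <- nms[is_num][order(as.numeric(nms[is_num]))]
  df[, c(nms[!is_num], num_sorted), drop = FALSE]
}

# Same sample data as in the question; check.names = FALSE avoids the "X" prefix
df <- read.table(text = "Col1 11 110 1100 12
a 1 2 20 22
b 16 5 3 18", header = TRUE, check.names = FALSE)

sorted <- sort_numeric_cols(df)
print(names(sorted))
```

Because the non-numeric names are picked out by pattern rather than by position, the same call should also work when there are several label columns or when they are not in the first position.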
