
I have a data frame with about 40 columns and 2000 rows. One of the columns contains values that mix numbers and letters, like this:

CLN

5T
14S
1A
12N

First I used `melt` to reshape my data into long format. I want to plot the data with CLN on the y-axis, ordered by the numeric part: 1A, 5T, 12N, 14S. I know I have to use `sort`, but how can I do this for values that combine numbers and letters? (Sorry, my df is huge and I cannot provide a reproducible example.)

g.f.l
  • Your post says `how can I do this for my numbers only and not the letters?` Can you clarify? – akrun Aug 20 '16 at 17:42
  • Yes I am sorry. I want to do the sorting based on the numbers but I don't want to remove the letters because for my plot I want the order to be like: 1A,1T,1N,1S...3A,3T,3N,3S. It is a huge df so there are many combinations between letters and numbers and I need the sorting for both but first based on numbers. – g.f.l Aug 20 '16 at 17:52
  • Okay, my solution does not remove any letters; it orders based on the numbers and leaves the letters untouched – akrun Aug 20 '16 at 17:53
  • Yes, and it works for the numbers (thank you), but I still have to do this for the letters too. – g.f.l Aug 20 '16 at 17:58
  • Okay, then your post is misleading, i.e. `how can I do this for my numbers only and not the letters?` – akrun Aug 20 '16 at 17:58
  • Oh boy, here come the head games. – Rich Scriven Aug 20 '16 at 18:04

2 Answers


Try mixedsort from gtools:

vec <- c("5T", "14S", "1A", "12N")

gtools::mixedsort(vec)
# [1] "1A"  "5T"  "12N" "14S"
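For the data-frame case in the question, `gtools::mixedorder` may be more convenient than `mixedsort`, because it returns the ordering permutation instead of the sorted values. The following is only a sketch, assuming gtools is installed; `df` and its `value` column are made-up sample data, not the OP's data:

```r
library(gtools)

# Hypothetical sample data; only the CLN column mirrors the question
df <- data.frame(CLN = c("5T", "14S", "1A", "12N", "1T", "5A"),
                 value = c(3, 1, 4, 1, 5, 9),
                 stringsAsFactors = FALSE)

# mixedorder() gives the row permutation, so all columns stay aligned
df_sorted <- df[mixedorder(df$CLN), ]

# Use the naturally sorted unique values as factor levels for plotting
df$CLN <- factor(df$CLN, levels = mixedsort(unique(df$CLN)))
levels(df$CLN)
```

Setting the factor levels is usually enough for a plot; reordering the rows themselves is optional.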
mtoto

If we need to sort by numbers, one way is to remove the non-numeric part using sub, convert to numeric and order the column. The OP's post says

how can I do this for my numbers only and not the letters?

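# numeric part of each value, e.g. 14 from "14S"
v1 <- as.numeric(sub("\\D+", "", df1$CLN))
# reorder whole rows so the other columns stay aligned with CLN
df1 <- df1[order(v1), , drop = FALSE]
df1$CLN
#[1] "1A"  "5T"  "12N" "14S"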

If we need to do this for both letters and numbers

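# compute both keys from the same (current) row order
v1 <- as.numeric(sub("\\D+", "", df1$CLN))
v2 <- sub("\\d+", "", df1$CLN)
df1 <- df1[order(v1, v2), , drop = FALSE]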
 

Then, we change it to factor with levels specified as the unique elements of 'CLN' for using that order in the plot.

df1$CLN <- factor(df1$CLN, levels = unique(df1$CLN))

NOTE: These are base R options; no packages are used.
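If only the plot order matters, the rows do not need to be reordered at all: the factor levels can be derived from the unique labels. A minimal base R sketch of the number-then-letter ordering described above, using hypothetical long-format data (`df2` and `value` are invented for illustration):

```r
# Hypothetical long-format data with repeated CLN values
df2 <- data.frame(CLN = c("3S", "1T", "12N", "1A", "3A", "1T", "12N"),
                  value = c(2, 7, 1, 8, 2, 8, 1),
                  stringsAsFactors = FALSE)

# Work on the unique labels only; the rows themselves stay where they are
lv  <- unique(df2$CLN)
num <- as.numeric(sub("\\D+", "", lv))  # numeric part, e.g. 12 from "12N"
let <- sub("\\d+", "", lv)              # letter part, e.g. "N" from "12N"

# Order by number first, then by letter to break ties
df2$CLN <- factor(df2$CLN, levels = lv[order(num, let)])
levels(df2$CLN)
# [1] "1A"  "1T"  "3A"  "3S"  "12N"
```

Plotting functions that respect factor levels, such as ggplot2's discrete axes, will then draw CLN in this order.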

data

df1 <- data.frame(CLN = c("5T", "14S", "1A", "12N"), stringsAsFactors=FALSE)
akrun