I have a data frame that stores a count for each model. The model names are alphanumeric strings. I generate a bar plot with ggplot2, with the models on the x axis and the counts on the y axis, and I want to control the order of the x axis. Currently the models appear in the order shown below, both in the data frame and on the x axis of the plot. I want them in natural (numeric) order instead, for example M_1, M_2, M_3, M_10, M_11, M_20, and so on.

Model   Count
M_1     73
M_10    71
M_100   65
M_11    65
M_110   64
M_111   71
M_13    70
M_130   73
M_2     72
M_20    69
M_200   63
M_21    72
M_210   72
M_211   67
M_3     78
M_30    76
M_300   59
M_31    73
M_310   64

I tried using order(), mixedsort(), and arrange() to order the data frame first, and factor() in ggplot2, but none of these worked.

geneDFColSum[with(geneDFColSum, order(geneDFColSum$Model)), ]

geneDFColSum[with(geneDFColSum, mixedsort(geneDFColSum$Model)), ]

library(dplyr)
arrange(geneDFColSum, Model)

Is there a way to achieve this? I could split the model number into a separate column and order by that column, but I am looking for an easier way.

SriniShine
  • The order of the rows in your data has no bearing on the order of the plot, only the order of the factor levels. If you can get the right order into some variable, `my_order`, then do `geneDFColSum$Model = factor(geneDFColSum$Model, levels = unique(geneDFColSum$Model))` to set the level order in the data order. – Gregor Thomas Apr 10 '18 at 22:49

2 Answers

You need to order the levels of the factor, not the rows of the data:

dd$Model = factor(dd$Model, levels = gtools::mixedsort(dd$Model))
ggplot(dd, aes(x = Model, y = Count)) + geom_col()


Using this as input data:

dd = read.table(text = "Model   Count
M_1 73
M_10    71
M_100   65
M_11    65
M_110   64
M_111   71
M_13    70
M_130   73
M_2 72
M_20    69
M_200   63
M_21    72
M_210   72
M_211   67
M_3 78
M_30    76
M_300   59
M_31    73
M_310   64", header = TRUE, stringsAsFactors = FALSE)
Gregor Thomas
  • I did exactly the same thing earlier and it did not work, but I have now found the error. When I created the data frame I did not set "stringsAsFactors = FALSE". I created the data frame again with this option and now it works. Thank you. – SriniShine Apr 10 '18 at 23:12
  • Yeah, mixedsort doesn't work on factors how you'd like. – Gregor Thomas Apr 10 '18 at 23:23
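
One way to guard against the factor pitfall mentioned in these comments is to convert the column to character before sorting. The following is only a sketch: it assumes the gtools package is installed, and it uses a small made-up subset of the question's data with Model deliberately stored as a factor. The unique() call is an extra precaution (not part of the original answer) so that repeated model names do not produce duplicate levels.

```r
# Subset of the question's data, with Model stored as a factor
# (what you get when strings are converted to factors on import).
dd <- data.frame(
  Model = factor(c("M_1", "M_10", "M_100", "M_11", "M_2", "M_20", "M_3")),
  Count = c(73, 71, 65, 65, 72, 69, 78)
)

# Convert to character first so the sort operates on the labels,
# then rebuild the factor with naturally sorted, de-duplicated levels.
model_chr <- as.character(dd$Model)
dd$Model <- factor(model_chr, levels = gtools::mixedsort(unique(model_chr)))

# Inspect the resulting level order; ggplot2 uses it for the x axis.
levels(dd$Model)
```

After this step the same ggplot call as in the answer can be used unchanged.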

Here's a solution based on your idea "separate the model number into a separate column and order by that column". You can then use that to reorder the factor levels.

library(tidyverse)

geneDFColSum %>% 
  mutate(Order = as.numeric(gsub("M_", "", Model))) %>% 
  arrange(Order) %>% 
  mutate(Model = factor(Model, levels = Model)) %>%
  ggplot(aes(Model, Count)) + 
    geom_col()
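
For readers who prefer to avoid extra packages, the same idea can be sketched in base R. This is an illustrative variant, not the answer author's code: it assumes every model name has the form "M_<number>", and it uses a small subset of the question's data. The names model_number and ordered_labels are introduced here for illustration only.

```r
# Subset of the question's data, with Model kept as character.
geneDFColSum <- data.frame(
  Model = c("M_1", "M_10", "M_100", "M_11", "M_2", "M_20", "M_3"),
  Count = c(73, 71, 65, 65, 72, 69, 78),
  stringsAsFactors = FALSE
)

# Strip the "M_" prefix and convert the remainder to a number.
model_number <- as.numeric(sub("^M_", "", geneDFColSum$Model))

# Order the labels by that number; unique() guards against repeated models.
ordered_labels <- unique(geneDFColSum$Model[order(model_number)])

# Set the factor levels in numeric order. The row order is left untouched,
# since only the level order affects the plot.
geneDFColSum$Model <- factor(geneDFColSum$Model, levels = ordered_labels)

levels(geneDFColSum$Model)
```

The resulting data frame can then be passed to ggplot(aes(Model, Count)) + geom_col() as above.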

neilfws