I have a data frame (df) with multiple categories. My variable of interest is Maximo, and I want to know when it occurs (Pasaje). This is the code I use:

ggplot(df,aes(Pasaje))+
  geom_histogram()+ theme_bw()+
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Which produces a plot of counts per Pasaje value (image not included).

Problem: Pasaje is a character vector that has a "real life order" (i.e., it starts at tra1 and goes up to tra30, then test1 up to test12).

I would like to be able to reorder the x axis.

Option 1: Increasing and/or Decreasing count

Option 2: from tra1 to tra30, then test1 to test12

My data frame is big, so I can only provide a small subset of it. I don't think it adds much to the question, but here it is just in case.

z<-df[1:10,]    
dput(z)
    structure(list(Dia = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
    12L, 12L), Mes = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), Año = c(2015L, 
    2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L
    ), Protocolo = c("2x3", "2x3", "2x3", "2x3", "2x3", "2x3", "2x3", 
    "2x3", "2x3", "2x3"), Animal = c("TR.1", "TR.10", "TR.11", "TR.12", 
    "TR.13", "TR.14", "TR.15", "TR.16", "TR.17", "TR.18"), Entrenamiento = c("VERDADERO", 
    "VERDADERO", "VERDADERO", "VERDADERO", "VERDADERO", "VERDADERO", 
    "VERDADERO", "VERDADERO", "VERDADERO", "VERDADERO"), Maximo = c(219.545, 
    24.273, 18.364, 5.864, 15.182, 142.545, 11.955, 1.455, 36.182, 
    146.182), Pasaje = c("tra2", "tra1", "test1", "tra2", "test1", 
    "tra2", "tra4", "test1", "test1", "tra2")), .Names = c("Dia", 
    "Mes", "Año", "Protocolo", "Animal", "Entrenamiento", "Maximo", 
    "Pasaje"), row.names = c(NA, 10L), class = "data.frame") 
Matias Andina
  • The plotted order is the order of the levels of your factor, `levels(z$Pasaje)`. You can use `reorder()`, edit the `levels()` directly, or specify the order when you convert to factor (if it started as a string). – Gregor Thomas Oct 10 '15 at 00:04
  • In that case, would I have to calculate the frequencies of each of the 42 levels somewhere else and then generate an ordered vector from them? Any advice on doing that? – Matias Andina Oct 10 '15 at 00:20
  • If you want to reorder by frequency (i.e., number of rows in your data, most to least), then `z$Pasaje = reorder(z$Pasaje, X = z$Pasaje, FUN = function(x) -length(x))`. For least to most you can do the same with `FUN = length`. – Gregor Thomas Oct 10 '15 at 00:27
  • If you *already* have a vector `real_order` with the real life order, then `z$Pasaje = factor(z$Pasaje, levels = real_order)`. If you want that real_order to be the order they appear in your data, then `real_order = unique(z$Pasaje)`. – Gregor Thomas Oct 10 '15 at 00:31
  • For option 2, as above with `real_order = c(paste0("tra", 1:30), paste0("test", 1:12))` – Gregor Thomas Oct 10 '15 at 00:32
  • And, just for completeness, if you wanted to order by the mean of the `Maximo` corresponding to each level, `z$Pasaje = reorder(z$Pasaje, X = z$Maximo, FUN = mean)` – Gregor Thomas Oct 10 '15 at 00:34
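
Putting the comments above together, here is a minimal sketch of both options on the `Pasaje` values from the posted subset. The names `by_count` and `by_real_order` are made up for illustration. The plotting step uses `geom_bar()` rather than the question's `geom_histogram()`, on the assumption that a recent ggplot2 is installed (newer versions expect a continuous x for histograms); it is skipped if ggplot2 is not available.

```r
# Stand-in for the question's data: Pasaje values copied from the dput() subset
z <- data.frame(
  Pasaje = c("tra2", "tra1", "test1", "tra2", "test1",
             "tra2", "tra4", "test1", "test1", "tra2"),
  stringsAsFactors = FALSE
)

# Option 1: levels ordered from most to least frequent
# (use FUN = length instead for least to most)
by_count <- reorder(z$Pasaje, X = z$Pasaje, FUN = function(x) -length(x))

# Option 2: levels in the "real life" order tra1..tra30, then test1..test12
real_order <- c(paste0("tra", 1:30), paste0("test", 1:12))
by_real_order <- factor(z$Pasaje, levels = real_order)

# Plot with whichever ordering is wanted; the x axis follows the factor levels
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  z$Pasaje <- by_real_order   # or by_count for Option 1
  p <- ggplot(z, aes(Pasaje)) +
    geom_bar() +              # counts rows per level
    theme_bw() +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))
  print(p)
}
```

Levels that never occur in the data (for example tra30 in this subset) are dropped from the axis by default; adding `scale_x_discrete(drop = FALSE)` should keep them if empty slots are wanted.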
