After getting help sorting my data.frame in a specific order, I thought I could label my plot axes using scale_x_discrete(), passing the same order so that data and labels match. The labels are created in the right order, but ggplot seems to order the dataset by itself, which means that the bars do not match the labels.
As you can see in the screenshots, the bars are drawn in the same order in both plots (one with and one without scale_x_discrete(limits = orderSort)). Is there any way to suppress the internal ordering and apply the order of new.df$UserEmail instead?

# Load packages
library(plyr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(reshape2)

# Load data
RawDataSet <- read.csv("http://pastebin.com/raw/VP6cF31A", sep=";")

# Summarising the data
new.df <- RawDataSet %>% 
  group_by(UserEmail,location,context) %>% 
  tally() %>%
  mutate(n2 = n * c(1,-1)[(location=="NOT_WITHIN")+1L]) %>%
  group_by(UserEmail,location) %>%
  mutate(p = c(1,-1)[(location=="NOT_WITHIN")+1L] * n/sum(n))

# Reorder new.df based on a defined vector
new.df <- new.df[ order(match(new.df$UserEmail, as.integer(c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6")) )), ]

# Same vector which is used to sort new.df
orderSort <- c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6")

ggplot() +
  geom_bar(data = new.df[new.df$location == "NOT_WITHIN",],
           aes(x = UserEmail, y = n2, color = "darkgreen", fill = context),
           size = 1, stat = "identity", width = 0.7) +
  geom_bar(data = new.df[new.df$location == "WITHIN",],
           aes(x = UserEmail, y = n2, color = "darkred", fill = context),
           size = 1, stat = "identity", width = 0.7) +
  # Labels are created in the right order, but geom_bars are not sorted
  # scale_x_discrete(limits = orderSort) +
  scale_y_continuous(breaks = seq(-25,25,5),
                     labels = c(25,20,15,10,5,0,5,10,15,20,25)) +
  scale_color_manual("Location of interaction",
                     values = c("darkgreen","darkred"),
                     labels = c("NOT_WITHIN","WITHIN")) +
  scale_fill_manual("Type of interaction",
                    values = c("lightyellow","lightblue"),
                    labels = c("Clicked A","Clicked B")) +
  guides(color = guide_legend(override.aes = list(color = c("darkred","darkgreen"),
                                                  fill = NA, size = 2), reverse = TRUE),
         fill = guide_legend(override.aes = list(fill = c("lightyellow","lightblue"),
                                                 color = "black", size = 0.5))) +
  coord_flip() +
  theme_grey() +
  theme(
    axis.text.x = element_text(angle = 0, hjust = 1, vjust = 0.5, size = 14),
    axis.title = element_blank(),
    legend.title = element_text(face = "italic", size = 14),
    legend.key.size = unit(1, "lines"),
    legend.text = element_text(size = 11))

Without scale_x_discrete():
[image: Without Label]
With scale_x_discrete():
[image: with Label]

schlomm
  • Haven't tried anything, but would have expected you to have used `scale_y_discrete` or `scale_x_continuous` on such a plot. – IRTFM Feb 10 '16 at 20:34
  • @42- I think the scales are used correctly as @schlomm uses `coord_flip` as well. – Jaap Feb 10 '16 at 20:53

1 Answer

UPDATE: The trick is to convert the UserEmail variable to a factor variable:

# converting 'UserEmail' to a factor variable
new.df$UserEmail <- factor(as.character(new.df$UserEmail),
                           levels = unique(new.df$UserEmail))


# and use:
scale_x_discrete(limits = orderSort)

this results in the following plot:

[image]
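To illustrate why the conversion helps, here is a minimal sketch on made-up data (the `toy` data frame and the `wanted` vector are hypothetical, not the asker's dataset). It assumes, as the as.integer() call in the question suggests, that UserEmail is a numeric column, so the bars are placed by numeric value while the labels follow the limits. Once the column is a factor, the bar positions follow the factor levels:

```r
library(ggplot2)

# Hypothetical toy data standing in for new.df (ids stored as numbers)
toy <- data.frame(id = c(3, 1, 2), value = c(10, 20, 30))

# Hypothetical desired axis order, playing the role of orderSort
wanted <- c("2", "3", "1")

# Convert to a factor whose levels are given in the desired order
toy$id <- factor(as.character(toy$id), levels = wanted)

# The integer codes are the positions ggplot2 uses on a discrete axis
levels(toy$id)
as.integer(toy$id)

p <- ggplot(toy, aes(x = id, y = value)) +
  geom_col() +
  coord_flip()
```

Note that the answer above takes the levels from unique(new.df$UserEmail), which relies on new.df already being sorted in the desired order. Passing the order vector directly as `levels`, as in this sketch, should avoid that dependency.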


OLD ANSWER: If I understand you correctly, you should define the breaks instead of defining the limits. Using:

scale_x_discrete(breaks = orderSort, limits = sort(unique(new.df$UserEmail)))
# or:
scale_x_discrete(breaks = orderSort, limits = as.integer(orderSort))

gives:

[image]

Jaap
  • Hey Jaap - thanks for your input! Actually, I intended something different. The bars correspond to the right persons - that's good :D But what I was trying to ask was how I can get the y-axis to look like my second picture, with the bars matching their ids. For example: 28 (the label and, of course, the corresponding bars) should be at the bottom of the y-axis, with 27, 25, 23, 22... above it, as defined by my orderSort vector. – schlomm Feb 10 '16 at 20:55
  • @schlomm See the update. HTH – Jaap Feb 10 '16 at 21:36
  • Great! Thanks for your time and support :) – schlomm Feb 10 '16 at 23:21
  • @schlomm Glad I could help. Also nice to see you are using the code from one of my previous answers :-). Now the question became more clear, I marked it as a duplicate because this problem has been asked before. Hope you don't mind. – Jaap Feb 11 '16 at 07:52