
I am trying to create a geom_line version of a geom_bar. The reason I want lines is that when I enter

geom_bar(aes(fill = Decile), position = position_dodge())

I am stuck with ten segments per group and my bar chart looks extremely cluttered.
I have 11 separate x values going across the bottom.
The problem is I don't know how to use the count as a "y" variable. I have tried things like ..count.. and other approaches but am completely lost. Any ideas?


Thanks for the help!

My data looks like this:

Name Decile Division
Joe 1 San Diego
Jan 1 New York
Jay 2 San Diego
Lue 3 Dallas
Suz 2 Seattle
tye 3 Dallas

library(ggplot2)
library(scales)  # for percent_format()

MCD <- read.csv("Decile15.csv", header = TRUE)

MCD$MonthNo <- factor(MCD$MonthNo, levels = c(1:11), labels = c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November"))

Decile_names <- c('1' = "Decile #1",
                '2' = "Decile #2",
                '3' = "Decile #3",
                '4' = "Decile #4",
                '5' = "Decile #5"
                 )





MCDGraph <- ggplot(na.omit(MCD), aes(MonthNo))

MCDGraph +
  geom_bar(aes(fill = Division), color = "black", position = "fill") +
  facet_wrap(~Decile, nrow = 1, labeller = labeller(Decile = as_labeller(Decile_names))) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) +
  scale_fill_manual(values = c("#DA4424", "#24A0DA", "#F0BC0B", "#43F749", "#4348F7", "#F74369", "#D7B9F5")) +
  theme(panel.background = element_rect(colour = "black", fill = "white"),
        panel.grid.minor = element_line(color = "black", size = .5)) +
  labs(x = "2017", y = "% of Leads Per Month By Division") +
  scale_y_continuous(labels = percent_format()) +
  theme(strip.background = element_blank(), strip.text = element_text(size = 25))

This is what my % of grouped facet_wrap bar chart looked like.

I split the deciles into groups of 5 so that each PDF sheet shows five plots instead of 10. That chart is also only divided into the 7 divisions; I want one that is divided into the 10 deciles. Here is the code I used for my regular grouped bar chart, without any faceting.

MC <- read.csv("2017_Full_year.csv", header = TRUE)

MC$MonthNo <- factor(MC$MonthNo, levels = c(1:11), labels = c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November"))



MCH <- ggplot(na.omit(MC), aes(MonthNo))

MCH +
  geom_bar(aes(fill = Division), position = position_dodge()) +
  labs(x = "2017", y = "# of Leads") +
  theme(axis.line = element_line(color = "black", size = 3, linetype = "solid"),
        axis.text.x = element_text(face = "bold", color = "black", size = 14),
        axis.text.y = element_text(face = "bold", color = "black", size = 14)) +
  scale_y_continuous(name = "# Of Leads", breaks = seq(0, 1000, 50)) +
  theme(panel.background = element_rect(colour = "black", fill = "white"),
        panel.grid.minor = element_line(color = "black", size = .5),
        panel.grid.major = element_line(color = "black", size = .5)) +
  scale_fill_manual(values = c("#DA4424", "#24A0DA", "#F0BC0B", "#43F749", "#4348F7", "#F74369", "#D7B9F5"))

EDIT: Or should I just create a new CSV that already contains the final counts for each Decile in each month? This would be a quick fix, and I can pull the numbers very easily from SQL Server. I was just hoping to do this without having to create a new file.

Petey
  • Welcome to Stack Overflow! Your question does not contain a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) (MCVE). This makes it hard to understand and answer your question. Please share relevant code, a small excerpt of your data and the expected outcome. You can find detailed information on how to create a good MCVE for R [here](https://stackoverflow.com/q/5963269/4303162). – Stibu Nov 16 '17 at 20:38
  • You can try getting a line version of the geom_bar, but would you rather facet it? Instead of `ggplot(mpg, aes(x = class)) + geom_bar(aes(fill = manufacturer), position = 'dodge')` you could do `ggplot(mpg, aes(x = manufacturer)) + geom_bar(aes(fill = manufacturer), position = 'dodge', show.legend = F) + facet_wrap(~class, scales = 'free_x') + theme(axis.text.x = element_text(angle = 45, hjust = 1))` – TBT8 Nov 16 '17 at 21:41
  • Thanks for the advice. I already did a facet_wrap version where each Decile shows counts grouped by Division, and a percent version of the same chart. What I want now is one plot with 10 lines (one for each decile) showing the count for each month of the year so far. Sorry if I'm not explaining this well enough. – Petey Nov 16 '17 at 22:02
  • Could you post a representation of your data? I noticed the one you did up top doesn't include `MonthNo` which is used in your plot. Would something like this make do for a representation of it? `MCD <-tibble(Name = rep(c("Joe", "Jay", "Susan", "Nancy", "Mark"), 2), Decile = sample(c(1,2,3, 4),10, replace = T) , Division = rep(c('San Diego', 'New York', 'Dallas','Seattle','LA'), 2), MonthNo = rep(c(1,2,3,4,5),2))` – TBT8 Nov 16 '17 at 22:14
  • Yeah, that would be a good representation. My file has over 4,000 rows, but yes, each row has a unique key, 1 decile (1-10), 1 division (one of 7 cities), and 1 month number (1-11, for when the record occurred this year). I want to graph a line for each decile, with the points of the line representing the quantity. The x variable would be the month number, so 1-11 going across the bottom. For example, Decile 2 had 300 occurrences during month 3 and Decile 5 had 600 during month 3. – Petey Nov 16 '17 at 22:19

2 Answers


Not sure how this would look with your real data, but one option that looks nice in situations like this is a ridge plot, using the ggridges package. An example is below:

  library(ggridges)
  library(dplyr)
  library(ggplot2)


  set.seed(99)

  MCD <- tibble(
                Name = sample(c("Joe", "Jay", "Susan", "Nancy", "Mark"), 1000, T),
                Decile = sample(1:10,1000, T) ,
                Division = sample(c('San Diego', 'New York', 'Dallas', 'Seattle', 'LA'), 1000, T),
                MonthNo = sample(1:11, 1000 , T))


  MCD$MonthNo <- factor(MCD$MonthNo, levels = c(1:11), labels = c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November"))

  MCD %>% group_by(Decile, MonthNo) %>% summarize(Count = n()) %>% 
       ungroup() %>%  mutate(Decile = factor(Decile)) %>%  
       ggplot(aes(x = MonthNo, y = Decile, height = Count, group = Decile, fill = Decile)) + 
       geom_density_ridges(alpha = 0.7, show.legend = F, stat = "identity", scale = 1)
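
If ridges are not what you want, the same kind of summarised table can also be drawn as ordinary lines, one per decile, without writing a new file. The following is only a minimal sketch under assumptions: it uses made-up data with just the `Decile` and `MonthNo` columns, and the names `counts`, `Count` and `p` are mine, not from the question.

```r
library(dplyr)
library(ggplot2)

set.seed(99)

# Dummy data (made-up values) with the two columns the plot needs
MCD <- tibble(
  Decile  = sample(1:10, 1000, replace = TRUE),
  MonthNo = sample(1:11, 1000, replace = TRUE)
)

# One row per Decile/month combination; Count holds the number of records
counts <- MCD %>%
  count(Decile, MonthNo, name = "Count") %>%
  mutate(Decile = factor(Decile))

# One line per decile, with the pre-computed count mapped to y
p <- ggplot(counts, aes(x = MonthNo, y = Count, colour = Decile, group = Decile)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 1:11, labels = month.name[1:11]) +
  labs(x = "2017", y = "# of Leads")
p
```

Keeping `MonthNo` numeric here and labelling the breaks with `month.name` avoids having to convert the months to a factor first; with a factor on the x axis the `group = Decile` mapping is what keeps each line connected.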
TBT8

This should get you the "geom_line version of geom_bar":

library(ggplot2)

df <- data.frame(x=c(1,1,1, 3,3, 10,10,10,10))
ggplot(df, aes(x=x)) + geom_line(stat="count")
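
The same idea should extend to one line per group: map the grouping variable to `group` (and `colour`), and `stat = "count"` counts the rows separately for each group at each x value. This is a rough sketch on made-up data; the column names `MonthNo` and `Decile` are assumed from the question, and the values are random.

```r
library(ggplot2)

set.seed(99)

# Made-up raw data: one row per record, no pre-computed counts
df <- data.frame(
  MonthNo = sample(1:11, 1000, replace = TRUE),
  Decile  = factor(sample(1:10, 1000, replace = TRUE))
)

# stat = "count" tallies the rows per Decile and month; y is filled in from that count
p <- ggplot(df, aes(x = MonthNo, colour = Decile, group = Decile)) +
  geom_line(stat = "count") +
  scale_x_continuous(breaks = 1:11)
p
```

Note that a month with no records for a given decile is simply skipped rather than drawn as zero, so the line jumps straight to the next month that has data.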

UPDATE: Here is the version based on the dataset from TBT8's answer

library(dplyr)
library(ggplot2)

set.seed(99)
# create dummy dataset
MCD <- tibble(
  Name = sample(c("Joe", "Jay", "Susan", "Nancy", "Mark"), 1000, T),
  Decile = sample(1:10, 1000, T),
  Division = sample(c('San Diego', 'New York', 'Dallas', 'Seattle', 'LA'), 1000, T),
  MonthNo = sample(1:11, 1000, T))

MCD$MonthNo <- factor(MCD$MonthNo, levels = c(1:11),
                      labels = c("January", "February",
                                 "March", "April", "May",
                                 "June", "July", "August",
                                 "September", "October", "November"))

#create a numeric months vector
MonthNum <- 1:11
names(MonthNum) <- levels(MCD$MonthNo)

#create a factor to be used for facets
MCD$DecileF <- factor(
  MCD$Decile, 
  levels=as.character(sort(unique(MCD$Decile),
                           decreasing = TRUE)))

ggplot(MCD, aes(x = MonthNum[MonthNo], col=DecileF)) + 
  geom_path(stat = "count") +
  #geom_point(stat = "count") + 
  facet_grid(DecileF~.) + 
  scale_x_discrete(name ="Month", limits=names(MonthNum))
aivanov
  • Unfortunately it tells me "stat_count requires the following missing aesthetics: x". Maybe my data frame is a lot more complex. I will edit my post to describe what I am dealing with. – Petey Nov 16 '17 at 21:40
  • That's why I put "aes(x=x)". Can you please provide your code? Otherwise it'd be difficult to understand what the problem is... – aivanov Nov 16 '17 at 21:42
  • I posted it above. Your example worked. Let me show what I mean. – Petey Nov 16 '17 at 21:50
  • I cannot run your code, because I do not have the file "2017_Full_year.csv". Can you please upload it somewhere? Or type "dput(MC)" in the R console and paste the output, if the data.frame is not very big. – aivanov Nov 16 '17 at 22:06
  • My file has over 4,000 rows. Each row has a unique key, 1 decile (1-10), 1 division (one of 7 cities), and 1 month number (1-11, for when the record occurred this year). I want to graph a line for each decile, with the points of the line representing the quantity. The x variable would be the month number, so 1-11 going across the bottom. For example, Decile 2 had 300 occurrences during month 3, Decile 5 had 600 during month 3, Decile 2 had 100 during month 7, etc. I just want a different line for each decile. The problem is, I don't have the counts. Although I could at this point just make a new file – Petey Nov 16 '17 at 22:23
  • and have the total counts for each region by decile, but I have been dead set on trying to create it from the original data. Now I'm just curious if it can be done. – Petey Nov 16 '17 at 22:24
  • 4K rows is not much. If you cannot upload your original file, just create a file with some dummy data having the structure similar to your true data and upload this file. – aivanov Nov 16 '17 at 22:44
  • @Petey: should I update my answer and show you what I mean using the data from TBT8's answer? – aivanov Nov 17 '17 at 17:51