
I've read a related post (Simpler population pyramid in ggplot2), but I have a slightly different setup which results in a messed-up pyramid.

Make the test data frame:

test <- data.frame(cbind(c(replicate(3,"population 1"), replicate(3,"population 2")),c("top","middle","bottom","top","middle","bottom"),c(70,25,5,82,13,3)))

Fix the factor ordering:

levels(test$X3)
[1] "13" "25" "3"  "5"  "70" "82"

test$X3 <- factor(test$X3, levels=c(70,25,5,82,13,3))

levels(test$X2)
[1] "Bottom" "Middle" "Top" 

test$X2 <- factor(test$X2, levels=c("Top","Middle","Bottom"))

Then try to plot population 1:

library(ggplot2)
ggplot(data = test,  aes(x=X3, y=X2)) +
  geom_bar(data = subset(test, X1=="population 1") , stat = "identity")+
  coord_flip()

But it's wrong, and I can't figure out why. The top/middle/bottom factors are in inverse order:

Top, 70% shows as the smallest bar; Middle shows as middle; Bottom, 5% shows as the largest bar

Ultimately I want to make the following:

[Image: funnel with population 1 and population 2]

EDIT - I fixed the one-sided block by imposing the factor re-order in the opposite direction explicitly (below) but I still do not understand why ggplot won't recognize how to plot the data, so any explanation is welcome.

# THIS PLOTS ONE SIDE OF THE PYRAMID CORRECTLY
testdf <- data.frame(cbind(c(replicate(3,"population 1"), replicate(3,"population 2")),c("Top","Middle","Bottom","Top","Middle","Bottom"),c(70,25,5,82,13,3)))
testdf$X3 <- factor(testdf$X3, levels=c(5,25,70,3,13,82))
testdf$X2 <- factor(testdf$X2, levels=c("Bottom","Middle","Top"))
g <- ggplot(data = testdf,  aes(x=X3, y=X2))
g <- g + geom_bar(data = subset(testdf, X1=="population 1") , stat = "identity")
g + coord_flip()
Uwe
  • Just checking: do you want population 1 on each side (replicating the values side by side to make it symmetrical)? Or is that a typo, and you want population 1 on one side and population 2 on the other side of the pyramid? – elikesprogramming Nov 07 '18 at 16:56
  • Ultimately, I want population 1 on one side and population 2 on the other side. – user3100205 Nov 07 '18 at 17:04
  • OK, then you should perhaps avoid the subset approach and instead condition on the population to set the sign of the value. You also should not convert the values to a factor. And you are reordering the X2 variable in the reverse of the order you want. (Note also that your code did not work properly because the data use lowercase top, bottom, and middle, while the factor levels you set are capitalized.) See the answer below if it helps. But this will probably soon be marked as a duplicate. Good luck – elikesprogramming Nov 07 '18 at 17:25
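
The two data problems raised in the comment above can be reproduced in base R without any plotting. This is a minimal sketch, not the asker's exact data: the object names m, bad, good, and relabelled are illustrative only, and it assumes nothing beyond base R.

```r
# cbind() on character and numeric vectors yields a character matrix
m <- cbind(c("top", "bottom"), c(70, 5))
typeof(m)                 # "character"

# so the numeric column arrives in the data frame as text
# (a factor in R < 4.0, a character vector in R >= 4.0)
bad <- data.frame(m)
is.numeric(bad$X2)        # FALSE

# passing named vectors to data.frame() keeps each vector's own type
good <- data.frame(layer = c("top", "bottom"), percent = c(70, 5))
is.numeric(good$percent)  # TRUE

# levels that differ in case from the data turn every value into NA
relabelled <- factor(c("top", "bottom"), levels = c("Top", "Bottom"))
is.na(relabelled)         # TRUE TRUE
```

With the percentages stored as text or as a factor, ggplot2 treats them as categories rather than bar lengths, which would explain why the bars do not scale with the values.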

3 Answers


This should get you started

test <- data.frame(
    X1 = c(replicate(3, "population 1"), replicate(3, "population 2")),
    X2 = c("top", "middle", "bottom", "top", "middle", "bottom"),
    X3 = c(70, 25, 5, 82, 13, 3)
)

test$X2 <- factor(test$X2, levels = c("bottom", "middle", "top"))

library(ggplot2)
ggplot(data = test,  
       aes(x = X2, y = ifelse(X1 == "population 1", -X3, X3), fill = X1)) +
  geom_bar(stat = "identity") +
  coord_flip()
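
As for why the levels run bottom-first: ggplot2 places discrete values along an axis in factor level order, starting from the origin, so after coord_flip() the first level ends up at the bottom of the plot. A small base-R sketch of that mapping (layer and pos are illustrative names, not part of the original code):

```r
layer <- factor(c("top", "middle", "bottom"),
                levels = c("bottom", "middle", "top"))

# integer codes follow level order; discrete positions on the axis
# follow the same order, counted from the origin
pos <- as.integer(layer)
data.frame(layer, pos)
```

Here "bottom" gets position 1 and "top" position 3, so "top" is drawn highest once the axes are flipped.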


elikesprogramming

This is working for me:

test <-
  data.frame(
    X1 = c(replicate(3, "population 1"), replicate(3, "population 2")),
    X2 = c("top", "middle", "bottom", "top", "middle", "bottom"),
    X3 = c(70, 25, 5, 82, 13, 3)
  )
test$X3 <- with(test, ifelse(X1 == "population 1", -X3, X3))

library(ggplot2)
ggplot(data = test,  aes(x = X2, y = X3, fill = X1)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = abs)


Uwe

Posting this as an answer since it is a solution, after using the help above and pointers from https://rpubs.com/walkerke/pyramids_ggplot2:

Make the dataframe, testdf. Keep the response, testdf$percent, as numeric not factor:

testdf <- data.frame(population = c(replicate(3,"population 1"), replicate(3,"population 2")), 
                     layer =  c("Top","Middle","Bottom","Top","Middle","Bottom"), 
                     layernum = as.numeric(c(3,2,1,3,2,1)),
                     percent = as.numeric(c(70,25,5,82,13,3)))
testdf$percent <- ifelse(testdf$population == "population 1", -testdf$percent, testdf$percent)

Load ggplot2:

library(ggplot2)

Make the plot:

g <- ggplot(data = testdf,  aes(x=layer, y=percent, fill=population))
g <- g + geom_bar(data = subset(testdf, population=="population 1") , stat = "identity")
g <- g + geom_bar(data = subset(testdf, population=="population 2") , stat = "identity")

# label both sides with positive magnitudes; the values are percentages
g <- g + scale_y_continuous(breaks = seq(-100, 100, 25), 
                     labels = paste0(as.character(c(seq(100, 0, -25), seq(25, 100, 25))), "%"))
g + coord_flip()
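
The hand-built label vector in the scale_y_continuous() call is just the absolute value of the breaks, so it could also be derived from a single breaks vector. A minimal sketch, assuming the values are percentages; breaks and labels are illustrative names:

```r
# one vector of axis breaks, symmetric around zero
breaks <- seq(-100, 100, 25)

# magnitudes with a percent sign, so both sides of the pyramid read as positive
labels <- paste0(abs(breaks), "%")
labels
```

These could then be passed as scale_y_continuous(breaks = breaks, labels = labels), which keeps the breaks and their labels from drifting apart if the range changes.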