
First time asking here so forgive me if I'm not clear enough.

So far I have seen many replies to similar questions that explain how to sort bars by some field of a data frame, but I have not been able to find how to sort them by the default stat "count" of geom_bar (which is obviously NOT a field of the data frame). For example, I run this code:

library(ggplot2)
library(plotly)  # provides ggplotly()

Name <- c( 'Juan','Michael','Andrea','Charles','Jonás','Juan','Donata','Flavia' )
City <- c('Madrid','New York','Madrid','Liverpool','Madrid','Buenos Aires','Rome','Liverpool')
City.Id <- c(1,2,1,3,1,4,5,3)
df = data.frame( Name,City,City.Id )

a <- ggplot( df,aes( x = City, text=paste("City.Id=",City.Id)) ) +
geom_bar()

ggplotly(a)

I would then like to display the resulting bars ordered by their height (= count). Note that I must keep the "City.Id" info so it shows in the final plot. How can this be done?

Javi

2 Answers


Given that you're already using ggplot2, I'd suggest looking into what else the tidyverse can offer, namely the forcats package for working with factors.

forcats has a nice function, fct_infreq(), which (re)sets the levels of a factor so that they are in order of frequency, most frequent first. If the data is a character vector rather than a factor (as City is in your data), it first converts it to a factor and then sets the levels in frequency order.

Try this code:

# Load packages
library(ggplot2)
library(forcats)

# Create data
Name <- c( 'Juan','Michael','Andrea','Charles','Jonás','Juan','Donata','Flavia' )
City <- c('Madrid','New York','Madrid','Liverpool','Madrid','Buenos Aires','Rome','Liverpool')
City.Id <- c(1,2,1,3,1,4,5,3)
df = data.frame( Name,City,City.Id )

# Create plot
a <- ggplot(df, aes(x = fct_infreq(City), text=paste("City.Id=",City.Id)) ) +
  geom_bar()

a
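
If forcats is not available, the same idea can be sketched in base R by counting the values and rebuilding the factor with its levels in that order. This is only an illustration of the technique, not what fct_infreq() does internally; the names city_counts and City_by_freq are made up for this example.

```r
# Base-R sketch: order the levels of City by descending frequency
City <- c('Madrid','New York','Madrid','Liverpool','Madrid','Buenos Aires','Rome','Liverpool')

# Count each city, then sort the counts from largest to smallest
city_counts <- sort(table(City), decreasing = TRUE)

# Rebuild the factor with its levels in that order
City_by_freq <- factor(City, levels = names(city_counts))

# Inspect the new level order (the most frequent city comes first)
levels(City_by_freq)
```

Using x = City_by_freq (or a column of df holding it) in aes() should then give the same bar order as the fct_infreq() version.
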
Jim Leach

One could use reorder:

df$City <- reorder(df$City,df$City.Id,length)

and then plot with the code in the question.
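
Note that reorder() sorts the levels by increasing value of the summary function, so the call above puts the least frequent city first. If the tallest bar should come first instead, one possible variant (a sketch, not part of the original suggestion) is to negate the count:

```r
# Recreate the data from the question
Name <- c('Juan','Michael','Andrea','Charles','Jonás','Juan','Donata','Flavia')
City <- c('Madrid','New York','Madrid','Liverpool','Madrid','Buenos Aires','Rome','Liverpool')
City.Id <- c(1,2,1,3,1,4,5,3)
df <- data.frame(Name, City, City.Id)

# reorder() orders levels by increasing summary value, so using the
# negated group size puts the most frequent city first
df$City <- reorder(df$City, df$City.Id, FUN = function(x) -length(x))

# Inspect the new level order
levels(df$City)
```

City.Id stays in df unchanged, so the text aesthetic from the question keeps working.
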

thisisrg