
I have a data frame with football players' names, nationalities and stats from a football game. I want to find the 10 best players in each country, sum their "Special" stat, select the 10 countries with the highest sums, and then plot the result.

fifka3 <- fifka %>% group_by(Nationality) %>%
  top_n(n = 10, wt = Special) %>%
  summarize(Top10 = sum(Special)) %>% top_n(10)

When I plot it with:

ggplot(data=fifka3, aes(x=fct_infreq(Nationality),y=Top10)) +
      geom_bar(stat="identity") +
      mytheme_1() ##just my theme function to save time

the function fct_infreq() doesn't change the order of the factors on the plot and I have no clue why. Is it because I created the df "fifka3" from "fifka" using group_by() and the df "fifka3" still contains other factors like presented below? And what can I do to change the order within the ggplot() function?

str(fifka3)
   Classes ‘tbl_df’, ‘tbl’ and 'data.frame':    10 obs. of  2 variables:
   $ Nationality: Factor w/ 165 levels "Afghanistan",..: 3 13 19 35 54 59 78 122 127 139
   $ Top10      : int  23883 21409 23788 23008 21691 21581 21530 21595 22696 21483

A. Suliman
Dawid

2 Answers


fct_infreq() didn't work in this case because you've already summarised your data: each value of Nationality appears exactly once (i.e. freq = 1 for every nationality), so every level is tied and the existing alphabetical level order is kept.

If you are looking for solutions within the forcats package, what you want here is fct_reorder():

ggplot(data = fifka3, 
       aes(x = fct_reorder(Nationality, Top10, .desc = TRUE),
           y = Top10)) +
  geom_col() # geom_col() is equivalent to geom_bar(stat = "identity"), with less typing

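To see the difference on something small, here is a minimal sketch using hypothetical toy data (the names `toy`, `by_freq` and `by_value` and the values are made up for illustration, not taken from the FIFA dataset). It assumes the forcats package is installed:

```r
library(forcats)

# Hypothetical stand-in for a summarised table: one row per level
toy <- data.frame(
  Nationality = factor(c("A", "B", "C")),
  Top10       = c(10, 30, 20)
)

# Every level occurs exactly once, so fct_infreq() has only ties to work with
by_freq <- fct_infreq(toy$Nationality)
levels(by_freq)

# fct_reorder() orders the levels by another variable instead
by_value <- fct_reorder(toy$Nationality, toy$Top10, .desc = TRUE)
levels(by_value)  # "B" "C" "A"
```

fct_reorder() applies a summary function (the median by default) to the second variable within each level; with one row per level that is simply the value itself.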

For the record, expecting others to download data from a link is generally a surefire way to reduce the likelihood of getting assistance. Kaggle is not as bad as links from completely unverified sources, in my opinion, but then again, I had to log in before I could download anything. Please follow the advice here next time to provide data in an easily usable manner.

Z.Lin

Try using the levels argument of factor() to change the order of the factor levels:

fifka3 <- fifka %>% group_by(Nationality) %>%
  top_n(n = 10, wt = Special) %>%
  summarize(Top10 = sum(Special)) %>% top_n(10)

fifka3$Nationality <- factor(fifka3$Nationality,
  levels = fifka3$Nationality[order(fifka3$Top10, decreasing = TRUE)])

library(ggplot2)
ggplot(data=fifka3, aes(x=Nationality,y=Top10)) +
  geom_bar(stat="identity")

The bars should then appear in decreasing order of Top10.

Jim Chen
  • I got the data from: https://www.kaggle.com/thec03u5/fifa-18-demo-player-dataset/data, CompleteDataset.csv file. I do indeed want a plot in decreasing order, with Algeria on the far left, then Brazil, then Croatia etc. Unfortunately your solution doesn't work; the error is: "Error in `levels<-.factor`(`*tmp*`, value = c("Algeria", "Brazil", "Croatia", : number of levels differs". I guess it is because there are 165 levels for $Nationality and just 10 $Top10 sums. – Dawid Aug 25 '18 at 14:44
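
The error quoted in the comment appears to come from `levels<-`, which refuses to assign fewer levels than the factor already has. One way around unused levels is to rebuild the factor from its character values. A minimal sketch on hypothetical toy data (`f`, `score` and `g` are made-up names, not objects from the question):

```r
# Hypothetical factor with 4 levels of which only 2 are used,
# mimicking the 165-level Nationality column with 10 rows
f     <- factor(c("B", "C"), levels = c("A", "B", "C", "D"))
score <- c(10, 30)

# levels(f) <- c("C", "B") would fail here: 2 new levels for a 4-level factor

# Rebuild from character values so that only the levels in use remain,
# ordered by decreasing score
vals <- as.character(f)
g    <- factor(vals, levels = vals[order(score, decreasing = TRUE)])
levels(g)  # "C" "B"
```

Calling droplevels() on the column before reordering is another option if the goal is only to discard the unused levels.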