I have tried many solutions I found on the internet, but sadly none of them worked for me. I want to sort my Y axis because for some reason it seems to pick at random which value to show first. Here is my code:

ggplot(Dane,aes(Dane$State, Dane$`Unsheltered persons (% Homeless population)`))+
       geom_point(aes(Dane$State, Dane$`Unsheltered persons (% Homeless population)`),, color="red")+
       theme(text = element_text(size=15),axis.text.x = element_text(angle=90, hjust=1))+
       labs(title = "Wykres przedstawiajacy jaka czesc populacji nie ma schronienia",
            x="Stan", y="Brak schronienia")

As you can see, the Y axis doesn't go from the lowest to the highest % value:

[Image: plot in which the Y axis doesn't go from the lowest to the highest % value]

Here is what the data looks like:

State `Total Homeless~ `Rate of Homele~ `Chronic indivi~ `Chronic Person~ `Chronic Homele~ `Persons in fam~ `Unaccompanied ~
   <chr>            <dbl>            <dbl> <chr>            <chr>            <chr>            <chr>            <chr>           
 1 Alab~             4689              9.7 16.4%            1.9%             18.3%            27.8%            8.4%            
 2 Alas~             1946             26.5 8.5%             0.9%             9.5%             30.0%            8.6%            
 3 Ariz~            10562             15.9 10.1%            1.2%             11.2%            38.4%            6.4%            
 4 Arka~             3812             12.9 14.8%            1.0%             15.8%            16.7%            7.6%            
 5 Cali~           136826             35.7 25.9%            2.8%             28.7%            18.3%            11.3%           
 6 Colo~             9754             18.5 13.9%            4.4%             18.2%            52.2%            5.2%            
 7 Conn~             4448             12.4 19.6%            3.9%             23.5%            30.3%            5.3%            
 8 Dela~              946             10.2 6.9%             0.6%             7.5%             39.2%            3.7%            
 9 Dist~             6865            106.  25.7%            3.8%             29.5%            46.2%            2.4%            
10 Flor~            47862             24.5 16.3%            3.9%             20.2%            34.5%            7.2%            
Sandeep Kumar
  • Please provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Marmite Bomber Dec 26 '19 at 22:43
  • Please spend some effort improving the question. Your code includes `labs(..., y="Brak schronienia")`, but your plot shows "Calkowita...". And the data sample doesn't even include the sample data. Further, your sample data (now text, thanks) is not something we can use due to the formatting. Please use `dput(head(x,n=10))` or similar to provide an unambiguous data sample. – r2evans Dec 26 '19 at 23:00
  • BTW: since you provide `data=Dane`, you should not be using `Dane$` in the `aes(...)` portions. The only time you should be doing that is when you don't define `data=` and/or you over-ride it in specific layers. Also, if the aesthetics you use in `geom_point` do not override your global aesthetics, you don't need to define them again. Lastly, is there a reason you have an empty argument (`,,` in `geom_point`)? – r2evans Dec 26 '19 at 23:02

2 Answers

The variables that you are passing to ggplot (more specifically geom_point) look to be character vectors. ggplot2 treats a character vector as a discrete variable and orders its values the way R orders the levels of a factor by default: lexically, character by character. That is why, for example, "10.1%" is placed before "8.5%", and it is the order you are seeing in the plot.
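
To see why the axis looks shuffled, here is a small sketch using a few of the percentages from the sample data in the question. The vector name `pct` is only for illustration and is not part of the original code.

```r
# Hypothetical vector holding a few percentages as text, as in the tibble
pct <- c("16.4%", "8.5%", "10.1%", "25.9%", "6.9%")

# Sorting text compares character by character, not by numeric value
sort(pct)
# "10.1%" "16.4%" "25.9%" "6.9%" "8.5%"

# factor() uses the same sorted order for its default levels
levels(factor(pct))
```

Because "1" sorts before "6" and "8", the two-digit values come first, which matches the kind of ordering seen on the axis.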

There is some variety in how different programs deal with ordering in plots. Older programs (from before rich data structures) would consider the ordering to be a property of the plot, so you would specify any ordering as an option to the plot. R has richer data structures and sees ordering as a property of the data rather than the plot (you can specify it once and have it be consistent in all plots, tables, etc. instead of having to repeat the ordering over and over). This means that the best way to get the ordering you want is to modify your data (data frame or tibble) to have the variable(s) of interest be factors with the ordering that you want, then call ggplot on the modified data.

There are a few ways to do this. Since you are using ggplot2, you probably will not mind using other tidyverse packages. A simple approach is to use the str_sort function from the stringr package:

library(stringr)
# numeric = TRUE makes str_sort() compare runs of digits as numbers, so "9.5%" sorts before "10.0%"
Dane$`Unsheltered persons (% Homeless population)` <- factor(
  Dane$`Unsheltered persons (% Homeless population)`,
  levels = str_sort(unique(Dane$`Unsheltered persons (% Homeless population)`), numeric = TRUE)
)

There are other ways, such as base R's relevel or mutate from the dplyr package, among others.
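
As one possible base R alternative, the levels can be ordered by the number parsed out of each string. This is only a sketch: `pct`, `value` and `pct_f` are hypothetical names, and it assumes every entry looks like "12.3%".

```r
# Hypothetical stand-in for the real column (note the repeated "8.5%")
pct <- c("16.4%", "8.5%", "10.1%", "25.9%", "6.9%", "8.5%")

# Parse the number out of each string: drop the "%" sign, then convert
value <- as.numeric(sub("%", "", pct, fixed = TRUE))

# Use the distinct strings, ordered by their numeric value, as the levels
pct_f <- factor(pct, levels = unique(pct[order(value)]))

levels(pct_f)
# "6.9%" "8.5%" "10.1%" "16.4%" "25.9%"
```

The labels stay as text with the "%" sign, so the axis reads the same as before, only in numeric order.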

Note that it is better to pass data=Dane in the call to ggplot and use bare column names inside aes() rather than writing Dane$ before each variable.
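
For illustration, the plotting call from the question might be written as below. The three-row data frame is a made-up stand-in for `Dane` (its values are not from the real data), and the factor ordering discussed above is left out to keep the sketch short.

```r
library(ggplot2)

# Hypothetical stand-in for Dane; the percentages are invented for the example
Dane <- data.frame(
  State = c("Alabama", "Alaska", "Arizona"),
  `Unsheltered persons (% Homeless population)` = c("9.5%", "10.0%", "27.8%"),
  check.names = FALSE  # keep the column name with spaces and symbols as-is
)

# Columns are referred to by bare name because the data frame is passed once;
# geom_point() inherits the aesthetics, so they are not repeated there
p <- ggplot(Dane, aes(State, `Unsheltered persons (% Homeless population)`)) +
  geom_point(color = "red") +
  theme(text = element_text(size = 15),
        axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Wykres przedstawiajacy jaka czesc populacji nie ma schronienia",
       x = "Stan", y = "Brak schronienia")
```

Printing `p` draws the plot.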

Greg Snow

You could also try this before your ggplot:

Dane$`Unsheltered persons (% Homeless population)` <- as.numeric(unlist(strsplit(Dane$`Unsheltered persons (% Homeless population)`, "%")))
Zhiqiang Wang
  • I think that you need one more closing parenthesis. Also, `strsplit` will probably return a list, so you will probably need a call to `unlist` in there somewhere as well as possibly needing to drop elements that are empty or just whitespace. – Greg Snow Dec 30 '19 at 18:09
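
Following up on the points raised in the comment, one possible variant uses `sub` instead of `strsplit`, so the result is always a vector of the same length as the input. This is a sketch under the assumption that each non-missing entry is a number followed by "%"; the names `pct` and `value` are hypothetical.

```r
# Hypothetical stand-in for the real column, with a missing value and stray spaces
pct <- c("16.4%", "8.5%", "10.1%", NA, " 6.9% ")

# Trim whitespace, drop the "%" sign, then convert; NA stays NA
value <- as.numeric(sub("%", "", trimws(pct), fixed = TRUE))

value
```

With a numeric column ggplot2 uses a continuous scale, so the Y axis runs from the lowest to the highest value without any factor reordering.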