My bar graph has a strange y axis whose labels skip around seemingly at random: from -1.7% to -10.1%, then -10.3%, and then -2%. You can see it below:

[Image: my bar graph with the messed-up y axis]

Here is my code:

library(ggplot2)

healthd = read.csv("R/states.csv")

states = healthd[[1]]
insuredChange = healthd[[4]]
ggplot(data = healthd, aes(x = states, y = insuredChange)) +
geom_bar(stat="identity") +
theme(axis.text.x=element_text(angle = 90, hjust = 1))

What's going on here? How do I fix it?

Also, how can I get the x axis labels to all be right justified on the same line?

GT.
  • Check the class of your y-axis values. If they're not numeric, this problem can happen. – Mbr Mbr Apr 03 '17 at 14:19
  • `insuredChange` is probably a character or possibly a factor, not a `numeric`. Run `class(healthd$insuredChange)` to check. – GGamba Apr 03 '17 at 14:20
  • Note that the axis breaks aren't random. They're alphabetical. The "%" sign is causing the problem. Do `healthd$insuredChange = as.numeric(as.character(gsub("%","",healthd$insuredChange)))`. Then rerun the plot code. – eipi10 Apr 03 '17 at 14:24
  • It says `> class(healthd$insuredChange)` `[1] "NULL"` – GT. Apr 03 '17 at 14:25
  • @eipi10 after doing that I get: `Error in `$<-.data.frame`(`*tmp*`, "insuredChange", value = numeric(0)) : replacement has 0 rows, data has 52` – GT. Apr 03 '17 at 14:26
  • Pretty difficult to help more without a fully reproducible version of your data. – joran Apr 03 '17 at 14:50
  • See [how to create a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Seems like you might not have column names in your data.frame? You really should check `class(insuredChange)`. Note that R won't read in columns with percent signs as numeric. This is really a problem with you importing your data and not ggplot. However you haven't shown what the input looks like so it's difficult to help. – MrFlick Apr 03 '17 at 15:09
  • Here's the data source: https://www.kaggle.com/hhs/health-insurance/ – GT. Apr 03 '17 at 15:22
  • Also, if you look above you will see that I already checked the class and received NULL @MrFlick – GT. Apr 03 '17 at 16:23
  • You checked the class of `healthd$insuredChange`, not `insuredChanged`. From your example I would bet they are different. – MrFlick Apr 03 '17 at 17:01
  • @MrFlick I think you're confused. I never used any `insuredChanged` variable. I don't know where you got that from. – GT. Apr 03 '17 at 17:08
  • G.T - MrFlick is quite accurate as your comment above is "It says `> class(healthd$insuredChange) [1] "NULL" – G.T. 2 hours ago` - did you look at my answer below? – B Williams Apr 03 '17 at 17:22
  • You have the line `insuredChange = healthd[[4]]` and you used `aes(y = insuredChange)` in the code you provided above. I'm guessing that `names(healthd)[4] != "insuredChange"` so it's that local variable that's being used, not values from the data.frame. – MrFlick Apr 03 '17 at 17:23
  • I never used the variable `insuredChanged` – GT. Apr 04 '17 at 13:35

1 Answer

First - what you present isn't a reproducible example and nobody wants to sign up to access your data to help you out...

In your code:

states = healthd[[1]]

and

insuredChange = healthd[[4]]

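are assigning the columns to new variables in the global environment - they do not add or rename any columns in your data.frame. `healthd` has no column named `insuredChange`, hence the NULL you get from `class(healthd$insuredChange)`. When you use ggplot, it looks for the `aes()` names in your data.frame first and, not finding them there, falls back to those global variables.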

 healthd$states = healthd[[1]]
 healthd$insuredChange = healthd[[4]]

will add them as columns of `healthd`, which should work - though I don't have the data so I'm not completely sure.
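To make the difference concrete, here is a minimal sketch using a small made-up data.frame in place of the real file. The column names `State` and `Change` and the three rows are assumptions for illustration only, not the actual contents of `states.csv`.

```r
# Hypothetical stand-in for the real CSV; names and values are made up.
healthd <- data.frame(
  State  = c("Alabama", "Alaska", "Arizona"),
  Change = c("-1.7%", "-10.1%", "-2%"),
  stringsAsFactors = FALSE
)

# This creates a free-standing variable; healthd itself is unchanged.
insuredChange <- healthd[[2]]
class(healthd$insuredChange)  # "NULL": there is no column with that name
class(insuredChange)          # "character": the global variable does exist

# Assigning with `$` adds a real column that aes() can find in the data.
healthd$insuredChange <- healthd[[2]]
class(healthd$insuredChange)  # "character"
```

Note that the class comes back as character here, not numeric, which matters for the axis ordering discussed in the comments.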

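With those columns in place, this should generate the figure you want, provided `insuredChange` is numeric (see the comments about the "%" sign).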

ggplot(healthd, aes(states, insuredChange)) +
       geom_bar(stat="identity") +
       theme(axis.text.x=element_text(angle = 90, hjust = 1))
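If the y axis still jumps around after that, the "%" sign mentioned in the comments is the likely cause: R reads such values as text, so the axis is ordered as text. Below is a rough sketch of the conversion along the lines eipi10 suggested. The four values are taken from the question, and `insuredNum` is just an illustrative name.

```r
# Values as they appear on the axis in the question, stored as text.
insuredChange <- c("-1.7%", "-10.1%", "-10.3%", "-2%")

# Sorting text is alphabetical, not numeric (exact order depends on locale).
sort(insuredChange)

# Strip the percent sign, then convert to numeric.
insuredNum <- as.numeric(gsub("%", "", insuredChange, fixed = TRUE))
sort(insuredNum)  # -10.3 -10.1 -2.0 -1.7
```

On the real data this would be applied to the column itself, e.g. `healthd$insuredChange`, before calling `ggplot()`.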
B Williams