4

I have been working on this for hours and can't seem to get it right. The boxplot only gives me flat vertical lines, and it's driving me crazy. I get the same output with or without the factor() function.

ggplot(df2,aes(x = factor(Location),y=Final.Result)) + geom_boxplot()

Solved! There are some data values such as "< 0.005" which R reads as strings, so the whole column gets converted to a factor.

user3146687
  • Please paste the result of running dput(df2) into your question. – BrodieG Dec 30 '13 at 14:53
  • And include the code that produces the plot you have right now: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Paul Hiemstra Dec 30 '13 at 14:58
  • It's one heck of a list. Here is the pastebin link. http://pastebin.com/v1KRb6UM – user3146687 Dec 30 '13 at 15:00
  • @PaulHiemstra It's in the screenshot, but here it is: ggplot(df2,aes(x = factor(Location),y=Final.Result)) + geom_boxplot() – user3146687 Dec 30 '13 at 15:03
  • I would recommend editing the code into your question. In addition, if your input data is big, you can provide us with a suitably small subset that still reproduces the issue. You can even create dummy data. Finally, your input data does not look that big, so I'd suggest pasting it into your question (SO will be smart about what to show) to keep the data available later on. – Paul Hiemstra Dec 30 '13 at 15:04
  • How do I paste a table in? I've seen people do it before. – user3146687 Dec 30 '13 at 15:12

3 Answers

6

You got those lines because the variable Final.Result in your data frame is a factor, not numeric (you can check this with the function str()).

> str(df2)
'data.frame':   66 obs. of  3 variables:
 $ Location    : Factor w/ 17 levels "BOON KENG RD BLK 6 (DS)",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Parameter   : Factor w/ 54 levels "Aluminium","Ammonia (as N)",..: 37 37 37 37 37 37 37 37 37 37 ...
 $ Final.Result: Factor w/ 677 levels "< 0.0005","< 0.001",..: 645 644 654 653 647 643 647 647 646 646 ...

Try converting those values to numeric (df2 itself contains no non-numeric values). This will work as-is only for df2; if your whole data frame contains values such as "< 0.0005" or "< 0.001", you should decide how to treat them (replace them with NA or with some small constant).

df2$Final.Result2<-as.numeric(as.character(df2$Final.Result))
ggplot(df2,aes(x = factor(Location),y=Final.Result2)) + geom_boxplot()
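
As a minimal sketch of the two options mentioned above (NA or a small constant), the following uses a made-up vector in place of the real Final.Result column. The variable names and the choice of substituting the detection limit itself are assumptions for illustration, not something taken from the question's data. It also shows why the conversion goes through as.character(): calling as.numeric() directly on a factor returns the internal level codes rather than the printed values.

```r
# Hypothetical toy data standing in for the real Final.Result column
raw <- factor(c("0.012", "< 0.005", "0.020", "< 0.001"))

# Pitfall: as.numeric() on a factor gives the integer level codes,
# not the numbers that are printed
codes <- as.numeric(raw)

# Convert to character first, then flag the "< x" entries
txt <- as.character(raw)
censored <- grepl("^[[:space:]]*<", txt)

# Option 1: treat "< x" entries as missing (they become NA)
as_na <- suppressWarnings(as.numeric(txt))

# Option 2 (an assumption, pick what suits your analysis):
# strip the "<" and use the detection limit itself as the value
as_limit <- as.numeric(sub("^[[:space:]]*<[[:space:]]*", "", txt))
```

Either as_na or as_limit can then be stored in a new column and used as the y aesthetic, in the same way as Final.Result2 above.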
Didzis Elferts
4

This answer is only related to the title of the question, but this question ranks top if I google "ggplot2 boxplot only lines" and there was no other helpful search result on that search term, so I feel it fits here well:

Boxplots only work if you specify the quantity as the y aesthetic.

EDIT: Since ggplot2 3.3.0, there is an orientation= parameter that lets you change the orientation. See the ggplot2 NEWS on this.

Compare

 ggplot(mtcars, aes(x = factor(cyl), y = disp)) + geom_boxplot()

sample boxplot with x = grouping and y = quantity

which gives correct boxplots, with

 ggplot(mtcars, aes(y = factor(cyl), x = disp)) + geom_boxplot()

boxplot with incorrect aesthetic setting

which gives only lines instead of box plots.

To obtain horizontal box plots, use coord_flip():

 ggplot(mtcars, aes(x = factor(cyl), y = disp)) + 
   geom_boxplot() + coord_flip()

horizontal boxplot
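
For ggplot2 3.3.0 or later, the orientation parameter mentioned in the edit above can be used instead of coord_flip(). This is a small sketch assuming such a version is installed; the object name p is only illustrative.

```r
library(ggplot2)

# Quantity on x, grouping on y; orientation = "y" tells geom_boxplot()
# explicitly that the groups run along the y axis
p <- ggplot(mtcars, aes(x = disp, y = factor(cyl))) +
  geom_boxplot(orientation = "y")

# print(p) draws one horizontal box per cylinder group
```

In these versions the orientation is normally guessed from the aesthetics, so setting it explicitly is mainly useful when the guess goes wrong.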

akraf
4

Another reason you get flat lines instead of boxes is that you have already summarized the numeric values in your table (e.g. by calculating the mean, median, etc.), so boxplot() now sees only a single value per group.
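
To illustrate this case, here is a short sketch using mtcars as stand-in data (the object names are made up for the example). The first plot is built from one mean per group and collapses to flat lines; the second uses the raw rows and shows proper boxes.

```r
library(ggplot2)

# One summarized value (the mean) per cylinder group: nothing left to spread
summarised <- aggregate(disp ~ cyl, data = mtcars, FUN = mean)
p_flat <- ggplot(summarised, aes(x = factor(cyl), y = disp)) + geom_boxplot()

# The raw observations: quartiles differ within each group, so boxes appear
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = disp)) + geom_boxplot()
```

The fix in this situation is to pass the unsummarized data to ggplot() and let geom_boxplot() compute the statistics itself.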

carlite71