I have a dataset with butterfly species as rows and, in separate columns, the number of plant families, genera, and species that each one feeds on. I tried to make a frequency plot in R but ran into problems with my script. I could plot it in SPSS and Excel, but I want to use R for this particular dataset. Thanks a lot in advance.

My dataset is of the following form.

   Butterfly species  #host families  #host genera  #host species
1  xxxxxxxxx                      xx            xx             xx
2  xxxxxxxxx                      xx            xx             xx
   ...

It'll be really helpful if someone could help me with the script for this :)

gagolews
user6385
  • Because this is a programming question, it is better suited for stackoverflow. However, you are unlikely to get much help there until you make an effort to solve the problem yourself. What have you tried that is not working? – kmm Apr 21 '14 at 12:48
  • I tried using ggplot2 and the table function to get it, but I am not able to produce a plot in the end. I am very new to R. – user6385 Apr 21 '14 at 12:54
  • You will have a much better chance of getting a good answer if you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Jaap Apr 21 '14 at 13:21

1 Answer

As the comments say, you need to post a reproducible example. This assumes there is one row per butterfly species.

# generate reproducible example
set.seed(1)
df <- data.frame(Butterfly=LETTERS[1:10],
                 Families=sample(1:20,10,replace=TRUE),
                 Genera=sample(1:20,10,replace=TRUE),
                 Species=sample(1:20,10,replace=TRUE))

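library(ggplot2)
library(reshape2)
# reshape from wide to long: one row per (Butterfly, Type) pair
gg <- melt(df,id.vars="Butterfly",value.name="Freq", variable.name="Type")
# one bar per butterfly, one panel (row) per Type
ggplot(gg, aes(x=Butterfly, y=Freq, fill=Type))+
  geom_bar(stat="identity")+
  facet_grid(Type~.)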
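
To see what melt() is doing in the snippet above, here is a rough base-R sketch of the same wide-to-long reshaping on a tiny made-up data frame. The values are invented for illustration, and `measure_cols` and `long` are just names chosen here. This is not how reshape2 implements it, only an equivalent result for this simple case.

```r
# Small made-up example in the same "wide" layout as the question
df <- data.frame(Butterfly = c("A", "B", "C"),
                 Families  = c(2, 5, 1),
                 Genera    = c(4, 9, 3),
                 Species   = c(7, 12, 3))

# Every column except the id column is a measured variable
measure_cols <- setdiff(names(df), "Butterfly")

# Stack the measured columns on top of each other:
# one row per (Butterfly, Type) pair, with the count in Freq
long <- data.frame(
  Butterfly = rep(df$Butterfly, times = length(measure_cols)),
  Type      = factor(rep(measure_cols, each = nrow(df)), levels = measure_cols),
  Freq      = unlist(df[measure_cols], use.names = FALSE)
)

print(long)
```

The long layout is what lets ggplot map `Type` to `fill` and to the facets, since each bar then corresponds to exactly one row.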

You can also put everything on one plot (without facets), but IMO it is much less clear.

ggplot(gg, aes(x=Butterfly, y=Freq, fill=Type))+
  geom_bar(stat="identity", position="dodge")

EDIT (Response to OP's comment)

So now that we have the data, there are several options - all of which are variations on a theme. Since the names of the butterfly families are long, we can rotate the labels:

df <- structure(list(Butterfly_Family = structure(c(4L, 5L, 3L, 6L, 2L, 1L), .Label = c("Hesperiidae", "Lycaenidae", "Nymphalidae", "Papilionidae", "Pieridae", "Riodinidae"), class = "factor"), Family = c(13L, 15L, 55L, 1L, 55L, 33L), Genara = c(50L, 42L, 219L, 2L, 148L, 97L), Species = c(88L, 79L, 307L, 2L, 233L, 137L)), .Names = c("Butterfly_Family", "Family", "Genara", "Species"), class = "data.frame", row.names = c(NA, -6L))

gg <- melt(df,id.vars="Butterfly_Family",value.name="Freq", variable.name="Type")
ggplot(gg, aes(x=Butterfly_Family, y=Freq, fill=Type))+
  geom_bar(stat="identity")+
  theme(axis.text.x=element_text(angle=-90, vjust=0.2))+
  facet_grid(Type~.)

Alternatively, we can rotate the whole graph, using coord_flip().

ggplot(gg, aes(x=Butterfly_Family, y=Freq, fill=Type))+
  geom_bar(stat="identity")+
  coord_flip()+
  facet_grid(Type~.)

Finally, we can rotate the graph and change the facets from row-wise to column-wise.

ggplot(gg, aes(x=Butterfly_Family, y=Freq, fill=Type))+
  geom_bar(stat="identity")+
  coord_flip()+
  facet_grid(.~Type)

jlhoward
  • This is my dataset. I tried what you suggested and was able to get the plot, but I was unable to get the names on the x axis. Please help. structure(list(Butterfly_Family = structure(c(4L, 5L, 3L, 6L, 2L, 1L), .Label = c("Hesperiidae", "Lycaenidae", "Nymphalidae", "Papilionidae", "Pieridae", "Riodinidae"), class = "factor"), Family = c(13L, 15L, 55L, 1L, 55L, 33L), Genara = c(50L, 42L, 219L, 2L, 148L, 97L), Species = c(88L, 79L, 307L, 2L, 233L, 137L)), .Names = c("Butterfly_Family", "Family", "Genara", "Species"), class = "data.frame", row.names = c(NA, -6L)) – user6385 Apr 22 '14 at 02:24
  • Thanks a lot for your response, but I am still getting A, B, C, D etc. as the names instead of the family names. I noticed that it changed from the family names to A, B, C, D, E etc. when I entered the data.frame command. Should I change the ratio or the variables in the data frame? Please help. Thanks a lot in advance. – user6385 Apr 22 '14 at 06:17
  • What should I replace LETTERS with to get the names of the butterfly families? Thanks a lot in advance and sorry for all the trouble. – user6385 Apr 22 '14 at 06:32
  • I tried it again but I don't know the command to retain the row names. Please help. – user6385 Apr 22 '14 at 16:01
  • I was just creating an example to illustrate the procedure. You already have the data, which I called `df` in my example. See my new edit, which simply adds one line. – jlhoward Apr 22 '14 at 16:07
  • Thanks a ton. Much appreciated :) And my apologies for the inconvenience caused! – user6385 Apr 22 '14 at 16:13
  • If I have to rank them from the highest value to the lowest and show that on the plot, how should I go about it? I used the code {gg$Lepidoptera.Family<-with(gg, factor(Lepidoptera.Family, levels=Lepidoptera.Family[order(LHP.Families)]))} to order one variable, but was unable to do it with all three variables. – user6385 Apr 24 '14 at 03:23
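
On the ranking question in the last comment, one possible approach is to fix the order of the factor levels in the wide data frame before melting, since ggplot2 draws a discrete axis in factor-level order. The sketch below is only an assumption about what is wanted: it uses the column names from the data posted in the comments (not `Lepidoptera.Family` or `LHP.Families`, which do not appear in that data), and the choice of `Species` as the ranking column is arbitrary.

```r
# Data as posted in the comments, rebuilt with plain vectors
df <- data.frame(
  Butterfly_Family = c("Papilionidae", "Pieridae", "Nymphalidae",
                       "Riodinidae", "Lycaenidae", "Hesperiidae"),
  Family  = c(13L, 15L, 55L, 1L, 55L, 33L),
  Genara  = c(50L, 42L, 219L, 2L, 148L, 97L),
  Species = c(88L, 79L, 307L, 2L, 233L, 137L)
)

# Rank the families from highest to lowest Species count and use that
# ranking as the factor level order (Species is an arbitrary choice here)
ranked <- df$Butterfly_Family[order(df$Species, decreasing = TRUE)]
df$Butterfly_Family <- factor(df$Butterfly_Family, levels = ranked)

print(levels(df$Butterfly_Family))
```

Melting and plotting can then proceed as in the answer, and all three panels should share this one order. Ranking each panel by its own variable is a different problem, because the facets share a single set of factor levels; drawing three separate plots may be the simplest workaround.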