
I have 18 species, each with size classes and a count of observations per size class. I am trying to write a for loop that creates a separate histogram for each species (not facets, as there are too many species). For loops are my weakest area in R, and I have often written extra code to avoid them, but with 18 species that is no longer an option.

Here is a sample of my formatted data:

Species    Size.Class   TotalCount
P. porphyreus   35  1
P. porphyreus   20  5
P. porphyreus   25  5
P. insularis    35  2
P. insularis    5   10
P. insularis    10  10
P. insularis    30  12
P. insularis    25  35
P. insularis    15  36
P. insularis    20  36
P. cyclostomus  30  2
P. cyclostomus  35  2
P. cyclostomus  25  4
P. cyclostomus  15  7
P. cyclostomus  20  8

When I create a histogram for one species I get the intended result:

ggplot(subset(Spcount,Species %in% c("P. porphyreus")),aes(x=Size.Class))+
  geom_histogram(binwidth=5)+
  ggtitle("P. porphyreus Histogram")+
  labs(y= "Total Count", x = "Size Class")

But when I try to automate it using this for loop:

for (i in Spcount$Species) {
  ggplot(subset(Spcount,Species %in% c("i")),aes(x=Size.Class))+
    geom_histogram(binwidth=5)+
    ggtitle("i Histogram")+
    labs(y= "Total Count", x = "Size Class") 
}

I get a single graph titled "i Histogram", but it is blank, and there are no errors or warnings.

CourtneyK
  • Firstly, it's easier to help you if you share reproducible data, for example the output of `dput(head(Spcount))`. Secondly, you're subsetting on the string `"i"`, not the object `i`, so your subset returns an empty data frame. – heds1 Aug 21 '19 at 22:04
  • With 18 taxa, facets should work fairly well - just make the plot bigger, or consider breaking into two 3*3 plots. – Richard Telford Aug 21 '19 at 22:21
  • Have you tried using purrr instead of a loop? You could put the species in a list and pass it through a predefined histogram function in which the only thing that varies is the input species. This may be of help: https://stackoverflow.com/questions/57298510/how-to-generate-multiple-similar-ggplots-together/57314675#57314675 – Edgar Zamora Aug 22 '19 at 15:12
  • @RichardTelford, your suggestion did provide a nice result, thank you. I also now understand the for loop construct that I will need in the future, so thank you MaaniB as well. – CourtneyK Aug 22 '19 at 19:24

1 Answer


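You should loop over the unique values of Species with `for (i in unique(Spcount$Species))`; looping over `Spcount$Species` itself repeats every species once per row. Inside the loop, compare against the variable `i` rather than the string `"i"`, and print each plot explicitly, because ggplot objects are not auto-printed inside a for loop.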
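The difference between the quoted string and the loop variable can be checked directly. This is a minimal sketch on a hypothetical three-row data frame (`df`, `n_string`, and `n_variable` are illustrative names, not part of the question's data):

```r
# Hypothetical miniature data frame, for illustration only
df <- data.frame(
  Species    = c("P. insularis", "P. insularis", "P. cyclostomus"),
  Size.Class = c(15, 20, 30)
)

i <- "P. insularis"

# Quoted: looks for a species literally named "i", so nothing matches
n_string <- nrow(subset(df, Species %in% c("i")))

# Unquoted: uses the value stored in the variable i
n_variable <- nrow(subset(df, Species == i))

n_string    # 0 rows
n_variable  # 2 rows
```

An empty subset is why the plot in the question came out blank.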

First, I recreate your sample data:

Spcount <- data.frame(
  Species = c(
    "P. porphyreus", "P. porphyreus", "P. porphyreus",
    "P. insularis", "P. insularis", "P. insularis", "P. insularis",
    "P. insularis", "P. insularis", "P. insularis", "P. cyclostomus", 
    "P. cyclostomus", "P. cyclostomus", "P. cyclostomus", "P. cyclostomus"
    ),
  Size.Class = c(
    35, 20, 25, 35, 5, 10, 30, 25, 15, 20, 30, 35, 25, 15, 20
  ),
  TotalCount = c(
    1, 5, 5, 2, 10, 10, 12, 35, 36, 36, 2, 2, 4, 7, 8
  )
)

Then,

library(ggplot2)

# Loop over each species once; plots must be printed explicitly inside a loop
for (i in unique(Spcount$Species)) {
  # Compare against the variable i, not the string "i"
  subsetted_Spcount <- subset(Spcount, Species == i)
  p <- ggplot(subsetted_Spcount, aes(x = Size.Class)) +
    geom_histogram(binwidth = 5) +
    ggtitle(paste0(i, " Histogram")) +
    labs(y = "Total Count", x = "Size Class")
  print(p)
}

In RStudio, use Next plot (Ctrl + Alt + F12) and Previous plot (Ctrl + Alt + F11) in the Plots pane to step through the histograms.
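If stepping through the plot history becomes tedious, one alternative is to keep the plots in a named list and print them on demand. The following is only a sketch: it uses base R `lapply()` rather than anything from the question, the shortened `Spcount` is a stand-in for the real data, and `species_names` and `plots` are illustrative names.

```r
library(ggplot2)

# Shortened stand-in for the Spcount data from the question
Spcount <- data.frame(
  Species    = c("P. porphyreus", "P. porphyreus", "P. insularis",
                 "P. insularis", "P. cyclostomus"),
  Size.Class = c(20, 25, 15, 20, 30),
  TotalCount = c(5, 5, 36, 36, 2)
)

species_names <- unique(Spcount$Species)

# Build one histogram per species and collect them in a list
plots <- lapply(species_names, function(sp) {
  ggplot(subset(Spcount, Species == sp), aes(x = Size.Class)) +
    geom_histogram(binwidth = 5) +
    ggtitle(paste0(sp, " Histogram")) +
    labs(y = "Total Count", x = "Size Class")
})
names(plots) <- species_names

# Print a single plot by species name
print(plots[["P. insularis"]])
```

Each element of `plots` can then be printed or saved individually, without rerunning the loop.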

maaniB
  • I have used this code with other variables (a bar graph faceted by location) and it works great, except for species that have only one data point. For these, it's creating one wide bar that spans the whole x axis. Any suggestions? – CourtneyK Aug 26 '19 at 19:34
  • Dear @CourtneyK, it may be due to the large `binwidth`; decreasing it should give more reasonable bars. However, for small samples or variables with few data points, **histograms** are not the best way to visualize a distribution; **boxplots** are better suited to such variables. – maaniB Aug 26 '19 at 20:47
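Another possible approach to the single wide bar, not taken from the comments above, is to give every plot the same x-axis range, so that a lone bin is drawn at its true width instead of being stretched to fill the panel. This is a sketch under assumptions: "P. rare" is a hypothetical species with a single data point, and `bin_width` and `x_limits` are illustrative names.

```r
library(ggplot2)

# Stand-in data; "P. rare" is a hypothetical species with one data point
Spcount <- data.frame(
  Species    = c("P. insularis", "P. insularis", "P. insularis", "P. rare"),
  Size.Class = c(5, 20, 35, 20),
  TotalCount = c(10, 36, 2, 1)
)

bin_width <- 5

# Shared x range across all species, padded by half a bin on each side
x_limits <- range(Spcount$Size.Class) + c(-bin_width, bin_width) / 2

for (sp in unique(Spcount$Species)) {
  p <- ggplot(subset(Spcount, Species == sp), aes(x = Size.Class)) +
    geom_histogram(binwidth = bin_width) +
    # coord_cartesian() zooms the view without dropping any bins
    coord_cartesian(xlim = x_limits) +
    ggtitle(paste0(sp, " Histogram")) +
    labs(y = "Total Count", x = "Size Class")
  print(p)
}
```

`coord_cartesian()` is used here rather than scale limits because it only changes the visible region and leaves the binned data untouched.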