I am trying to create a graph that displays the abundance of species over time. I would like the final graph to look like something produced with geom_density, with a smooth curve for each species (for example, Figure 1 on this site). However, I have not been able to get R to use my y values (abundances) instead of density counts. I did manage to use geom_area, but that isn't really the output I want. Does anyone know how to make geom_density accept y values, or alternatively, how to plot species abundances as smooth curves?

Example:

data.set <- data.frame(
  Time = c(rep(1, 4),rep(2, 4), rep(3, 4), rep(4, 4)),
  Type = rep(c('a', 'b', 'c', 'd'), 4),
  Value = rpois(16, 10)
)

Here, Value is the species abundance, Time is the time point at which each abundance was recorded, and Type identifies the four species.

ggplot(data.set, aes(Time, Value)) + geom_area(aes(fill = Type))

Once plotted, it is very "chunky." I would prefer to use something like geom_density to make smooth curves and then use alpha to make them transparent.

Any help would be appreciated!

aosmith

2 Answers

You could use spline() to interpolate (as per this answer from 2010):

library(ggplot2)
library(data.table) #for rbindlist

# break data.frame into chunks by type
type_chunks <- split(data.set, data.set$Type)

# apply spline function to each chunk
spline_chunks <- lapply(seq_along(type_chunks), function(i) {
  x <- type_chunks[[i]]
  data.frame(Type=names(type_chunks)[i],
             spline(x$Time, x$Value, n=50)) # choose a value for n
})

# crush chunks back to one data.frame
spline_df <- rbindlist(spline_chunks)

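# original plot
# (linewidth replaces size for lines in ggplot2 >= 3.4.0; use size on older versions)
ggplot(data.set, aes(Time, Value)) + geom_line(aes(color = Type), linewidth = 2)

# plot smoothed version
ggplot(spline_df, aes(x, y)) + geom_line(aes(color = Type), linewidth = 2)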

[Image: original plot]

[Image: smoothed version]

Note that I drew these as line plots rather than area plots, since that matches the post you linked, and geom_area stacks the series by default rather than drawing them independently.
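If overlapping, semi-transparent areas are wanted (as the question describes), the same spline idea can be combined with geom_area. This is a minimal sketch rather than part of the original answer: the seed, the base-R do.call(rbind, ...) step, and the clamping with pmax() are assumptions added for illustration.

```r
library(ggplot2)

# Example data in the same shape as the question (seed is an assumption,
# added only so the sketch is reproducible)
set.seed(42)
data.set <- data.frame(
  Time  = rep(1:4, each = 4),
  Type  = rep(c("a", "b", "c", "d"), 4),
  Value = rpois(16, 10)
)

# Interpolate each species separately with a cubic spline
spline_df <- do.call(rbind, lapply(split(data.set, data.set$Type), function(d) {
  s <- spline(d$Time, d$Value, n = 50)
  # A spline can overshoot below zero; clamp because abundances are non-negative
  data.frame(Type = d$Type[1], Time = s$x, Value = pmax(s$y, 0))
}))

# position = "identity" overlays the areas instead of stacking them,
# and alpha makes the overlaps visible
p <- ggplot(spline_df, aes(Time, Value, fill = Type)) +
  geom_area(position = "identity", alpha = 0.4)
print(p)
```

With position = "identity" each curve is read against the y axis on its own, so the heights are the interpolated abundances rather than cumulative totals.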

arvi1000
  • Also, I'm sure @hrbrmstr will be along soon to post about `geom_xspline` but that will require installing another package: https://github.com/hrbrmstr/ggalt – arvi1000 Apr 14 '16 at 00:50

Interestingly, this is a case where you could use stat_density if the dataset weren't already summarized.

It's pretty easy to expand a simple dataset like this based on the counts summarized in Value, where you add rows based on Value. See options here.

# Make expanded dataset
d2 = data.set[rep(row.names(data.set), data.set$Value), 1:2]
head(d2)

    Time Type
1      1    a
1.1    1    a
1.2    1    a
1.3    1    a
1.4    1    a
1.5    1    a

Then you can make the desired plot by mapping the computed count to the y aesthetic, written after_stat(count) in current ggplot2 (older versions use ..count.., and size instead of linewidth). You can make density plots, or you can make line plots using stat = "density". Here is an example of the latter.

ggplot(d2, aes(Time, y = after_stat(count), color = Type)) +
    geom_line(linewidth = 1, stat = "density")

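For the filled, transparent look the question asks about, the expanded data can also be passed to geom_density directly. The following is a sketch under stated assumptions: the seed and the adjust value are illustrative choices, and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Example data in the same shape as the question (seed is an assumption)
set.seed(42)
data.set <- data.frame(
  Time  = rep(1:4, each = 4),
  Type  = rep(c("a", "b", "c", "d"), 4),
  Value = rpois(16, 10)
)

# Expand to one row per counted individual
d2 <- data.set[rep(seq_len(nrow(data.set)), data.set$Value), c("Time", "Type")]

# Filled density curves scaled to counts; alpha makes overlaps visible.
# adjust scales the default bandwidth (1 leaves it unchanged).
p <- ggplot(d2, aes(Time, fill = Type)) +
  geom_density(aes(y = after_stat(count)), alpha = 0.4, adjust = 1)
print(p)
```

Note that the curve heights here are kernel density estimates multiplied by the number of observations, so they will not generally equal the recorded abundance at each time point; the spline approach keeps the original values at the sampled times.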

aosmith