2

I am trying to do something similar to an old post: plotting - original post

For my analysis, I am interested in whether different mammal hosts have different flea communities. The original post I have linked to has 2 different solutions for the ellipses. My problem is when I run both the 1st solution and then the general solution I get vastly different looking plots while I think they should be very similar. Below is my code.

My question is: Am I doing something incorrectly or which code produces the correct figure? Or is there a better new code I should be using instead to display differences in flea communities by host species?

Thanks, Amanda

Link to ARG_comm data Link to ARG_env data

library(vegan)
library(BiodiversityR)
library(MASS)

comm_mat <- read.csv("d:/fleaID/ARG_comm.csv",header=TRUE)
env <- read.csv("d:/fleaID/ARG_env.csv",header=TRUE)

library(ggplot2)
sol <-metaMDS(comm_mat)
MyMeta=env

#originalresponse
NMDS = data.frame(MDS1 = sol$points[,1],     MDS2=sol$points[,2],group=MyMeta$host)
NMDS.mean=aggregate(NMDS[,1:2],list(group=NMDS$group),mean)

veganCovEllipse<-function (cov, center = c(0, 0), scale = 1, npoints = 100) 
{
  theta <- (0:npoints) * 2 * pi/npoints
  Circle <- cbind(cos(theta), sin(theta))
  t(center + scale * t(Circle %*% chol(cov)))
}

df_ell <- data.frame()
for(g in levels(NMDS$group)){
  df_ell <- rbind(df_ell, cbind(as.data.frame(with(NMDS[NMDS$group==g,],
                                                   veganCovEllipse(cov.wt(cbind(MDS1,MDS2),wt=rep(1/length(MDS1),length(MDS1)))$cov,center=c(mean(MDS1),mean(MDS2)))))
                                ,group=g))
}

ggplot(data = NMDS, aes(MDS1, MDS2)) + geom_point(aes(color = group)) +
  geom_path(data=df_ell, aes(x=MDS1, y=MDS2,colour=group), size=1, linetype=2)+
  annotate("text",x=NMDS.mean$MDS1,y=NMDS.mean$MDS2,label=NMDS.mean$group)

#update - can use se (standard error) or sd (standard dev)
#update
NMDS = data.frame(MDS1 = sol$points[,1], MDS2 =        sol$points[,2],group=MyMeta$host)

plot.new()
ord<-ordiellipse(sol, MyMeta$host, display = "sites", 
             kind = "se", conf = 0.95, label = T)

df_ell <- data.frame()
for(g in levels(NMDS$group)){
  df_ell <- rbind(df_ell, cbind(as.data.frame(with(NMDS[NMDS$group==g,],
                                                   veganCovEllipse(ord[[g]]$cov,ord[[g]]$center,ord[[g]]$scale)))
                            ,group=g))
}

ggplot(data = NMDS, aes(MDS1, MDS2)) + geom_point(aes(color = group)) +
geom_path(data=df_ell, aes(x=NMDS1, y=NMDS2,colour=group), size=1, linetype=2)
Community
  • 1
  • 1
Amanda Goldberg
  • 169
  • 1
  • 11
  • It is very difficult to diagnose the problem without the original dataset. Could you make a reproducible example for us? – Rahul Mar 15 '17 at 02:31
  • @Rahul, I have added links to a subset of the data. I double checked the subset of the data and it produces similar results to the full data set. – Amanda Goldberg Mar 16 '17 at 23:32
  • 1
    Your example is still too tedious to use for busy people. For instance, the link to your data does not point to a csv file and cannot be used directly. We still do not know in which way the solutions look very different. However, there is one peculiar thing: in the second example when you use `ordiellipse`, you set `display = "species"` whereas your `NMDS` data and first case implicitly had `"sites"`. – Jari Oksanen Mar 23 '17 at 09:19
  • Jari, I created a link to .csv files so you only need to download them now. I also changed the 'display=' back to "sites" from "species" but the plots still look quite different between the two examples. Thanks. – Amanda Goldberg Mar 28 '17 at 21:42

1 Answers1

4

There are two key differences between the first and second plot methods. The first method is calculating the ellipse paths using standard deviation and no scaling. The second method is using standard error and is also scaling the data. Thus, the plot produced with the first method can also be achieved with the second method by making the necessary changes to the ordiellipse function (kind='sd', not 'se'), and removing the scale (ord[[g]]$scale) from the veganCovEllipse function. I have included this below so you can see for yourself.

Ultimately, both plots are correct, it just depends on what you want to show. As long as you specify the use of standard error or deviation, either can be used. As for whether or not to scale, this really depends on your data. This link may be helpful: Understanding `scale` in R.

First method:

sol <-metaMDS(comm_mat)
MyMeta=env

#originalresponse
NMDS = data.frame(MDS1 = sol$points[,1],     MDS2=sol$points[,2],group=MyMeta$host)
NMDS.mean=aggregate(NMDS[,1:2],list(group=NMDS$group),mean)

veganCovEllipse<-function (cov, center = c(0, 0), scale = 1, npoints = 100) 
{
  theta <- (0:npoints) * 2 * pi/npoints
  Circle <- cbind(cos(theta), sin(theta))
  t(center + scale * t(Circle %*% chol(cov)))
}

df_ell <- data.frame()
for(g in levels(NMDS$group)){
  df_ell <- rbind(df_ell, cbind(as.data.frame(with(NMDS[NMDS$group==g,],
                                                   veganCovEllipse(cov.wt(cbind(MDS1,MDS2),wt=rep(1/length(MDS1),length(MDS1)))$cov,center=c(mean(MDS1),mean(MDS2)))))
                                ,group=g))
}

ggplot(data = NMDS, aes(MDS1, MDS2)) + geom_point(aes(color = group)) +
  geom_path(data=df_ell, aes(x=MDS1, y=MDS2,colour=group), size=1, linetype=2)+
  annotate("text",x=NMDS.mean$MDS1,y=NMDS.mean$MDS2,label=NMDS.mean$group)

Gives:

plot1

Second method:

plot.new()

ord<-ordiellipse(sol, MyMeta$host, display = "sites", 
                 kind = "sd", conf = .95, label = T)

df_ell <- data.frame()
for(g in levels(NMDS$group)){
  df_ell <- rbind(df_ell, cbind(as.data.frame(with(NMDS[NMDS$group==g,],
            veganCovEllipse(ord[[g]]$cov,ord[[g]]$center)))
            ,group=g))
}

plot2<-ggplot(data = NMDS, aes(MDS1, MDS2)) + geom_point(aes(color = group)) +
  geom_path(data=df_ell, aes(x=NMDS1, y=NMDS2,colour=group), size=1, linetype=2)+
  annotate("text",x=NMDS.mean$MDS1,y=NMDS.mean$MDS2,label=NMDS.mean$group)

plot2

Also gives:

plot2

Community
  • 1
  • 1
J.Con
  • 4,101
  • 4
  • 36
  • 64
  • 1
    Thank you so much. Your reply was very helpful. I now have a much better understanding of what is going on and how to proceed with my data. I really appreciate you taking the time to answer my question. – Amanda Goldberg Apr 24 '17 at 02:03
  • My pleasure. Consider marking this as the answer if it solved your question. :) – J.Con Apr 24 '17 at 03:16
  • This would be so helpful, but right now there seems to be a problem with the calculation of the df_ell dataframe; giving the group as `as.factor()` into the NMDS dataset results in a df_ell dataframe with points for a single ellipse. I seem not to be the only person with this problem, see [this question](https://stackoverflow.com/questions/63075125/plot-ordiellipse-from-vegan-in-ggplot2-example-not-working) – Benno May 29 '21 at 16:25