I am attempting a PCoA with a binary data set. With the help of the Riffomonas Project YouTube channel, I have been able to create the distance matrix and run the analysis. I have the results; the only problem is that I can't colour the points in my ggplot. I set the row names as samples, but I seem to be missing something.
cal_fem_dist <- dist(cfd, method = "binary")
cal_fem_dist2 <- as.matrix(cal_fem_dist)
pcoa_cf <- cmdscale(cal_fem_dist2, eig=T, add=T)
positions <- pcoa_cf$points
head(positions)
colnames(positions) <-c("pcoa1", "pcoa2")
percent_explained <- 100 * pcoa_cf$eig /sum(pcoa_cf$eig)
pretty_pe <- round(percent_explained[1:2], digits = 1)
pretty_pe
labs <- c(glue("PCo 1 ({pretty_pe[1]}%)"),
glue("PCo 2 ({pretty_pe[2]}%)"))
cal_fem_data2 = as.data.frame(cal_fem_data2)
head(positions)
positions %>%
as_tibble (rownames="samples") %>%
ggplot(aes(x=pcoa1, y=pcoa2, color = "samples")) +
geom_point() +
labs(x=labs[1], y=labs[2])
If I set the colour to "samples", every point gets the legend label "samples" and they are all the same colour. I did see in Pat's tutorial that he used inner_join to add characters to colour the points by, but I just want to colour the points according to the characters used for the PCoA, i.e., the sample names.
To see my full code (to get the distance matrix) and the data used for the analysis, I have uploaded it to GitHub: link to my public Github for stack overflow
I fixed this by going back, redoing the PCoA, and then converting its results into a data frame. This let me assign groups and colour the points according to those groups.
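For anyone hitting the same thing, here is a minimal sketch of that approach. It runs on a small made-up presence/absence matrix, not my real data, so the sample names, trait names, and the two groups (`group_a`, `group_b`) are all invented for illustration. It is also worth noting that `color = "samples"` in quotes inside `aes()` maps the literal string "samples" to every point, which is why they all ended up with one colour and one legend entry; the column name has to be unquoted.

```r
library(ggplot2)

# Toy binary data: 6 samples x 6 traits (values invented for illustration).
toy <- matrix(c(1, 1, 0, 0, 1, 0,
                1, 0, 0, 1, 1, 0,
                1, 1, 1, 0, 0, 0,
                0, 0, 1, 1, 0, 1,
                0, 1, 1, 1, 0, 1,
                0, 0, 0, 1, 1, 1),
              nrow = 6, byrow = TRUE,
              dimnames = list(paste0("sample_", 1:6), paste0("trait_", 1:6)))

# Same steps as in the question: binary distance, then PCoA.
toy_dist <- dist(toy, method = "binary")
toy_pcoa <- cmdscale(toy_dist, k = 2, eig = TRUE, add = TRUE)

# Convert the coordinates to a data frame and keep the row names as a column.
positions <- as.data.frame(toy_pcoa$points)
colnames(positions) <- c("pcoa1", "pcoa2")
positions$samples <- rownames(positions)

# Hypothetical grouping: first three samples in one group, last three in another.
positions$group <- rep(c("group_a", "group_b"), each = 3)

# Note the unquoted column name inside aes().
p <- ggplot(positions, aes(x = pcoa1, y = pcoa2, color = group)) +
  geom_point()
print(p)
```

Using `color = samples` (again unquoted) instead of `color = group` gives each sample its own colour, which is what I was originally after.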