
I am attempting a PCoA with a binary data set. I was able to create the distance matrix and run the analysis with the help of the Riffomonas Project YouTube channel. I have the results of the analysis; the only problem is that I can't colour the points in my ggplot. I set the row names as samples, but I seem to be missing something.

cal_fem_dist <- dist(cfd, method = "binary")
cal_fem_dist2 <- as.matrix(cal_fem_dist)

pcoa_cf <- cmdscale(cal_fem_dist2, eig=T, add=T) 
positions <- pcoa_cf$points
head(positions)
colnames(positions) <-c("pcoa1", "pcoa2")


percent_explained <- 100 * pcoa_cf$eig /sum(pcoa_cf$eig)
pretty_pe <- round(percent_explained[1:2], digits = 1)
pretty_pe
labs <- c(glue("PCo 1 ({pretty_pe[1]}%)"),
          glue("PCo 2 ({pretty_pe[2]}%)"))

cal_fem_data2 = as.data.frame(cal_fem_data2)
head(positions)
positions %>%
  as_tibble (rownames="samples") %>%
  ggplot(aes(x=pcoa1, y=pcoa2, color = "samples")) + 
  geom_point() +
  labs(x=labs[1], y=labs[2])

If I set the colour to "samples", all the points are labelled "samples" and are all the same colour. I did see in Pat's tutorial that he used inner_join to add characters to colour the points by, but I just want to colour the points according to the characters used for the PCoA, i.e., the sample names.
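A likely cause of the single colour is the quotation marks: inside `aes()`, `color = "samples"` maps colour to the constant string "samples", whereas the unquoted `color = samples` maps it to the `samples` column created by `as_tibble(rownames = "samples")`. The sketch below illustrates the difference on a made-up coordinate matrix; the values and the sample names `s1` to `s3` are assumptions for illustration, not the data from the question.

```r
library(ggplot2)
library(tibble)

# Toy stand-in for the PCoA coordinates (made-up values, not the real data)
positions <- matrix(
  c(0.1, -0.2, 0.3, 0.4, -0.1, 0.2),
  ncol = 2,
  dimnames = list(c("s1", "s2", "s3"), c("pcoa1", "pcoa2"))
)

# Move the row names into a regular column called "samples"
plot_df <- as_tibble(positions, rownames = "samples")

# Quoted: colour is mapped to the constant string "samples",
# so every point falls into the same single category
p_constant <- ggplot(plot_df, aes(x = pcoa1, y = pcoa2, color = "samples")) +
  geom_point()

# Unquoted: colour is mapped to the samples column,
# so each sample name becomes its own category
p_by_sample <- ggplot(plot_df, aes(x = pcoa1, y = pcoa2, color = samples)) +
  geom_point()
```

Printing `p_by_sample` should give one legend entry per sample name. With many samples that legend gets long, which is one reason to colour by a grouping variable instead.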

My full code (including the steps to get the distance matrix) and the data used for the analysis are on GitHub: link to my public GitHub for Stack Overflow

I fixed this by going back and redoing the PCoA, then converting the PCoA results into a data frame. This let me assign groups and colour the points according to those groups.

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Mar 23 '23 at 13:48
  • I've included a link in the edit for GitHub with my code and data – cassbarker_za Mar 24 '23 at 07:42
  • I fixed this by changing how I ran the pcoa and then plotted it – cassbarker_za Mar 27 '23 at 09:19
  • Please add your solution as an answer so that others may benefit from your works. Thanks – L Tyrone Mar 28 '23 at 02:14

1 Answer

library(ggplot2)

# dist_matrix is the binary distance matrix, e.g. from dist(..., method = "binary")
pcoa <- cmdscale(dist_matrix, eig = TRUE, add = TRUE)
# Convert the PCoA results into a data frame that can be plotted
pcoa_df <- data.frame(pcoa$points)
colnames(pcoa_df) <- c("PCo1", "PCo2")
# Add the group of interest; mine was Morphospecies in the data frame cal_fem_data2
pcoa_df$Species <- factor(cal_fem_data2$Morphospecies)

calf <- ggplot(pcoa_df, aes(x = PCo1, y = PCo2, color = Species)) + 
  geom_point(size = 2) +
  xlab("PCo1") +
  ylab("PCo2") + 
  ggtitle("Female") +
  theme_classic()
calf
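For anyone who wants to try the approach without the original data, here is a minimal self-contained sketch. The binary matrix `toy`, the sample names, and the `Group` labels are all made up for illustration; it also builds the percent-explained axis labels from the question with `sprintf()` rather than `glue`.

```r
library(ggplot2)

# Hypothetical binary character matrix: 6 samples x 5 characters (made-up values)
toy <- rbind(
  sample1 = c(1, 0, 1, 0, 1),
  sample2 = c(1, 1, 1, 0, 0),
  sample3 = c(0, 0, 1, 1, 1),
  sample4 = c(0, 1, 0, 1, 0),
  sample5 = c(1, 1, 0, 0, 1),
  sample6 = c(0, 1, 1, 1, 0)
)

# Binary distance between samples, then PCoA on the first two axes
toy_dist <- dist(toy, method = "binary")
pcoa <- cmdscale(toy_dist, k = 2, eig = TRUE, add = TRUE)

# Convert the coordinates to a data frame; row names carry the sample names
pcoa_df <- data.frame(pcoa$points)
colnames(pcoa_df) <- c("PCo1", "PCo2")
pcoa_df$Sample <- rownames(pcoa_df)
# Hypothetical grouping, standing in for something like Morphospecies
pcoa_df$Group <- factor(c("A", "A", "A", "B", "B", "B"))

# Percentage of the summed eigenvalues attributed to each of the two axes
percent_explained <- 100 * pcoa$eig / sum(pcoa$eig)
axis_labels <- sprintf("PCo %d (%.1f%%)", 1:2, percent_explained[1:2])

p <- ggplot(pcoa_df, aes(x = PCo1, y = PCo2, color = Group)) +
  geom_point(size = 2) +
  labs(x = axis_labels[1], y = axis_labels[2]) +
  theme_classic()
p
```

Swapping `color = Group` for `color = Sample` colours each point by its sample name instead, which is what the question originally asked for.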