I'm trying to create a Venn diagram to inspect how many variables (species) are shared between participant groups. I have a data frame with dimensions 97 (participants) x 320. The first 2 columns are participant_id and participant_group, and the remaining 318 columns are species names holding their respective counts. I want a Venn diagram that tells me how many species are shared between the groups. Here is a reproducible example:
participant_id <- c("P01","P02","P03","P04","P05","P06","P07","P08","P09","P10", "P11", "P12", "P13", "P14", "P15")
participant_group <- c("control", "responsive", "resistant", "non-responsive", "control", "responsive", "resistant", "non-responsive", "resistant", "non-responsive", "control", "responsive", "non-responsive", "control", "resistant")
A <- c(0, 54, 23, 4, 0, 2, 0, 35, 0, 0, 45, 0, 1, 99, 12)
B <- c(10, 0, 1, 0, 4, 65, 0, 1, 52, 0, 0, 15, 20, 0, 0)
C <- c(0, 0, 0, 5, 35, 0, 0, 45, 0, 0, 0, 22, 0, 89, 50)
D <- c(0, 0, 45, 0, 1, 0, 0, 0, 56, 32, 0, 0, 40, 0, 0)
E <- c(0, 0, 40, 5, 0, 0, 0, 45, 0, 1, 76, 0, 34, 56, 31)
F <- c(0, 64, 1, 5, 0, 0, 80, 0, 0, 1, 76, 0, 34, 0, 32)
G <- c(12, 5, 0, 0, 80, 45, 0, 0, 76, 0, 0, 0, 0, 32, 11)
H <- c(0, 0, 0, 5, 0, 0, 80, 0, 0, 1, 0, 0, 34, 0, 2)
example_df <- data.frame(participant_id, participant_group, A, B, C, D, E, F, G, H)
I can see all the wonderful Venn diagram packages out there, but I'm struggling to get my data into the format they expect. I have started with:
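library(dplyr)

# Sum the counts per group, then flag each species as present (1) or absent (0)
example_df %>%
  group_by(participant_group) %>%
  summarise(across(where(is.numeric), sum)) %>%
  mutate(across(where(is.numeric), ~ 1 * (.x > 0)))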
So now I have an indicator of whether each species (A, B, C, etc.) is present (1) or absent (0) within every group. Next, I want to show the overlap of species between the groups in a Venn diagram (something like this: https://statisticsglobe.com/venn-diagram-with-proportional-size-in-r). However, I am stuck on what to do next. Does anybody have any ideas? I hope this makes sense! Thanks for your time.
When using the code from @Paul Stafford Allen, I get this diagram, but the goal is to show which species (A, B, C, etc.) are shared between groups based on presence/absence alone, irrespective of the counts.
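To make the target concrete, here is a minimal base R sketch of the shape I believe most Venn packages expect: a named list with one character vector of present species per group. The names species_cols, species_by_group and shared_by_all are made up for illustration, and the commented-out plotting call assumes the ggVennDiagram package is installed; it is not part of my current code.

```r
# Rebuild the example data so this block runs on its own
example_df <- data.frame(
  participant_id = sprintf("P%02d", 1:15),
  participant_group = c("control", "responsive", "resistant", "non-responsive",
                        "control", "responsive", "resistant", "non-responsive",
                        "resistant", "non-responsive", "control", "responsive",
                        "non-responsive", "control", "resistant"),
  A = c(0, 54, 23, 4, 0, 2, 0, 35, 0, 0, 45, 0, 1, 99, 12),
  B = c(10, 0, 1, 0, 4, 65, 0, 1, 52, 0, 0, 15, 20, 0, 0),
  C = c(0, 0, 0, 5, 35, 0, 0, 45, 0, 0, 0, 22, 0, 89, 50),
  D = c(0, 0, 45, 0, 1, 0, 0, 0, 56, 32, 0, 0, 40, 0, 0),
  E = c(0, 0, 40, 5, 0, 0, 0, 45, 0, 1, 76, 0, 34, 56, 31),
  F = c(0, 64, 1, 5, 0, 0, 80, 0, 0, 1, 76, 0, 34, 0, 32),
  G = c(12, 5, 0, 0, 80, 45, 0, 0, 76, 0, 0, 0, 0, 32, 11),
  H = c(0, 0, 0, 5, 0, 0, 80, 0, 0, 1, 0, 0, 34, 0, 2)
)

# Every column except the two identifier columns is a species
species_cols <- setdiff(names(example_df),
                        c("participant_id", "participant_group"))

# Split the count columns by group, then keep the species whose
# total count within that group is greater than zero (i.e. present)
species_by_group <- lapply(
  split(example_df[species_cols], example_df$participant_group),
  function(g) names(g)[colSums(g) > 0]
)

# Species present in every group (the centre region of the Venn diagram)
shared_by_all <- Reduce(intersect, species_by_group)
print(species_by_group)
print(shared_by_all)

# Assumption: with ggVennDiagram installed, a named list like this can be
# passed straight to its plotting function:
# ggVennDiagram::ggVennDiagram(species_by_group)
```

Because the list holds species names rather than counts, the region sizes would reflect only presence/absence, which is what I am after.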