Is there an R function for finding shared traits among variables?

Question

I have a data set of plants and plant traits. It is a large data set with over 150 plants and over 300 different traits. However, I do not have data for all 300 traits for all 150 plants. Some plants have data for 100 traits; others have data for only 2 or 3.

I have figured out how to isolate which plants have the most trait data, but I can’t figure out how to isolate which traits these plants have in common.

For example, say I have 10 plants, numbered 1-10, and each of these 10 plants has data for 75 traits, with trait numbers varying from 1-3000. So each plant has 75 different traits, but with some overlap. I want to find which traits overlap, that is, isolate the traits that all of the plants share so I can analyze just those.

Is there an easy way to do this in R? It seems like there should be a relatively easy way, but I can’t quite figure it out.

My data set looks something like this, just much larger.

Sample Data Table

In this example I would want to highlight Traits #1 and #4, because those are the two which have data for all three plants.

I hope this all makes sense. Thanks everyone in advance for your help!

R is basically just a fancy calculator. Asking if there is an R function to do something isn't really a useful question. You should first know what statistical technique you want to use to create clusters. There's probably no one correct way to do it. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. — MrFlick, Nov 04 '19 at 16:02
Consider keeping your data long (i.e., two columns of *plant* and *trait*). Then simply aggregate for counts: `agg_df <- aggregate(plant ~ trait, df, FUN=length)`, even sort descending to find highest plant counts: `agg_df <- with(agg_df, agg_df[order(-plant),])` — Parfait, Nov 04 '19 at 17:07
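
To make the long-format suggestion concrete, here is a minimal sketch. The data frame `df` and its values are made up for illustration (the actual sample table is not reproduced here), and it assumes each plant/trait pair appears at most once, so that a row count equals a plant count. It counts plants per trait with `aggregate`, keeps the traits whose count equals the number of plants, and cross-checks the result with `Reduce(intersect, ...)`.

```r
# Hypothetical long-format data: one row per (plant, trait) with data.
# Assumes no duplicated plant/trait pairs.
df <- data.frame(
  plant = c("A", "A", "A", "B", "B", "B", "C", "C"),
  trait = c(1, 2, 4, 1, 3, 4, 1, 4)
)

# Count how many plants have data for each trait.
agg_df <- aggregate(plant ~ trait, df, FUN = length)

# Keep the traits that are present for every plant in the data set.
n_plants <- length(unique(df$plant))
shared_traits <- agg_df$trait[agg_df$plant == n_plants]

# Cross-check: intersect the per-plant trait sets directly.
shared_check <- Reduce(intersect, split(df$trait, df$plant))

print(shared_traits)
print(shared_check)
```

If only the plants with the most data matter, subsetting `df` to those plants first (for example `df[df$plant %in% top_plants, ]`, where `top_plants` is a hypothetical vector of plant IDs) should give the traits shared by that group alone.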
