I have a data frame with three columns: the name of the data point, the group number assigned to that data point, and the species (the data is animal related, and each data point belongs to one of two species).

Any given row looks like this:

Name         |  Group Number  |  Species
Data Point A |       3        |     1

I would like to split off a group only if at least 90% of its rows belong to a single species. For example, if group 3 is 10 rows long and 9 of those rows belong to the same species (either species 1 or species 2), then it satisfies my requirement and should be split. I have looked into the split function as well as the filter function from dplyr, but I can't figure out how to get R to split groups with this percentage-based requirement. Any help would be useful, thank you!

samkart
  • Welcome. Can you please provide the first 10 rows or so of your data? – Todd Burus Mar 17 '20 at 11:59
  • Welcome to SO. Please see [this](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on how to make a great reproducible example. In this case, it would be helpful to see a sample of what your dataframe looks like, as well as what your final desired result should look like after split. – Ben Mar 17 '20 at 13:49
  • I am not sure how to post rows of my data. However, it is just three columns. Any given row may look like this: (A_1,9 19 collared). That is the name of the data point (A_1,9), the group number it was sorted into via clustering algorithms (19), and the species it belongs to (collared). I would like to extract group 19 only if the data points that have been sorted into that group are either >90% collared or >90% of the other species (pied). – Giacomo Delgado Mar 21 '20 at 12:50
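
For what it's worth, the requirement described above could be sketched in base R roughly as follows. This is only a minimal sketch under stated assumptions: the sample data is made up, the column names `name`, `group`, and `species` are hypothetical, and the cutoff is taken as "90% or more" so that the 9-out-of-10 example from the question qualifies.

```r
# Hypothetical sample data; the column names and values are assumptions,
# loosely modelled on the (A_1,9 19 collared) row described in the comments.
df <- data.frame(
  name    = paste0("P", 1:15),
  group   = c(rep(3, 10), rep(7, 5)),
  species = c(rep("collared", 9), "pied",
              "collared", "collared", "pied", "pied", "pied"),
  stringsAsFactors = FALSE
)

# For each group, compute the share of rows held by its most common species.
dominant_share <- tapply(
  df$species, df$group,
  function(s) max(table(s)) / length(s)
)

# Assumed cutoff: 90% or more. Use > instead of >= for "strictly above 90%".
threshold <- 0.9
keep <- names(dominant_share)[dominant_share >= threshold]

# Keep only rows from qualifying groups, then split them into
# a list with one data frame per group.
qualifying   <- df[as.character(df$group) %in% keep, ]
split_groups <- split(qualifying, qualifying$group)
```

Here group 3 (9 of 10 rows collared) ends up in `split_groups`, while group 7 (3 of 5 rows pied) is dropped. With dplyr, the same filter could presumably be written as `df %>% group_by(group) %>% filter(max(table(species)) / n() >= 0.9)`, followed by `split()` on the result.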

0 Answers