I have a very large dataset and I'm trying to get away from writing for loops in R. I'm looking for a way to do what I would usually do with a nested loop.
For each unique value in the confidence column, I need to extract the indices of all rows whose confidence matches that value. For example, the first value (50) would return rows 1, 7, and 9. Then, using those indices, I want to average the corresponding values in the seqs column. Here, the first value (50) would pick out 1980, 7357, and 3008, and their average is 4115. The intended output is a data frame with 2 columns: one with the unique values of confidence and one with the average number of seqs for each unique confidence value.
input
#seqs confidence
1980 50
1088 52
1099 52
2000 42
7009 45
1092 48
7357 50
5909 42
3008 50
output
ave.#seqs confidence
4115 50
1093.5 52
3954.5 42...
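
A minimal sketch of one loop-free way to do the grouping described above, using only base R. It assumes the data is in a data frame called `df` with the columns named `seqs` and `confidence`; those names, and the output column name `ave.seqs`, are placeholders chosen for the example. `split()` shows the row indices per confidence value, and `aggregate()` computes the group means.

```r
# Sample data from the post; `df`, `seqs`, and `confidence` are assumed names.
df <- data.frame(
  seqs       = c(1980, 1088, 1099, 2000, 7009, 1092, 7357, 5909, 3008),
  confidence = c(50, 52, 52, 42, 45, 48, 50, 42, 50)
)

# Row indices for each unique confidence value
# (the lookup a nested loop would otherwise do).
idx <- split(seq_len(nrow(df)), df$confidence)
idx[["50"]]

# Mean of seqs within each confidence group, returned as a data frame.
res <- aggregate(seqs ~ confidence, data = df, FUN = mean)
names(res)[names(res) == "seqs"] <- "ave.seqs"

# aggregate() sorts groups by confidence; reorder the rows to the order
# in which each value first appears, matching the example output.
res <- res[match(unique(df$confidence), res$confidence), ]
res
```

If only the means are needed and a named vector is acceptable, `tapply(df$seqs, df$confidence, mean)` is a shorter alternative. For very large data, grouped summaries in packages such as data.table or dplyr may be faster, but that is not tested here.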