Looping over a large list to find the best-fitting homolog for each gene

Question

I have a large list of genes for which I want to look up homologs.

I also have a large data frame of potential homologs. The tenth column of this data frame contains a number describing the quality of the fit: the larger the number, the better.

I am trying to loop over this large list of genes.

For each unique gene in the list, I want to select the best-fitting homolog.

The output should be a data frame with one row per gene, describing its best-fitting homolog.

Please provide [example data](https://stackoverflow.com/q/5963269/680068). — zx8754, Jun 11 '19 at 09:15

score 0 · Answer 1 · answered Jun 11 '19 at 07:38

Tidyverse solution, assuming you have a column with a gene_id per gene and the fitting score is in a column called score:

```r
library(tidyverse)

df %>% group_by(gene_id) %>% filter(score == max(score)) %>% ungroup()
```
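Note that `filter(score == max(score))` keeps every row that ties for the top score, so a gene can end up with more than one row. If exactly one row per gene is needed, a minimal sketch along the following lines may help. It assumes dplyr 1.0.0 or later (for `slice_max()`), and the data, the `genes` vector, and the `homolog` column are made up for illustration, since the question did not include example data:

```r
library(dplyr)

# Hypothetical gene list and homolog table (not from the question)
genes <- c("geneA", "geneB", "geneC")
df <- data.frame(
  gene_id = c("geneA", "geneA", "geneB", "geneB", "geneB", "geneD"),
  homolog = c("h1", "h2", "h3", "h4", "h5", "h6"),
  score   = c(10, 42, 7, 7, 3, 99)
)

best <- df %>%
  filter(gene_id %in% genes) %>%                    # keep only genes from the list
  group_by(gene_id) %>%
  slice_max(score, n = 1, with_ties = FALSE) %>%    # at most one row per gene
  ungroup()

best
```

Genes from the list that have no candidate in `df` (here `geneC`) simply do not appear in the result. If the score column has no usable name, something like `names(df)[10] <- "score"` would name the tenth column first.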

Peter

This seems like a tidy solution =,D. But whenever I try to load the package with library() after installing it, I get the following error: Error in library(tidyverse) : there is no package called ‘tidyverse’ – Jonas Engelhardt Jun 13 '19 at 19:28
@JonasEngelhardt you actually only need the `dplyr` package for that and not the whole tidyverse. – Peter Jun 14 '19 at 07:06
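If installing packages keeps failing, the same selection can probably be done in base R with no extra packages at all. The following is only a sketch on invented data (the `homolog` column and all values are assumptions): sort so that each gene's highest score comes first, then keep the first row seen per gene.

```r
# Hypothetical example data (not from the question)
df <- data.frame(
  gene_id = c("geneA", "geneA", "geneB", "geneB"),
  homolog = c("h1", "h2", "h3", "h4"),
  score   = c(10, 42, 7, 3),
  stringsAsFactors = FALSE
)

# Sort by gene, with the highest score of each gene first ...
sorted <- df[order(df$gene_id, -df$score), ]

# ... then keep only the first row encountered for each gene
best_base <- sorted[!duplicated(sorted$gene_id), ]

best_base
```

With an unnamed score column, `df[[10]]` could stand in for `df$score` in the `order()` call.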
