I have a dataframe (about 300 rows) in which one column is called "geneID":
geneID distance pvalue
4 30 0.05
409 0 0.001
60 41 0.02
...
I have a second dataframe that gives the range of gene IDs making up each antibiotic biosynthetic gene cluster (there are about 30 gene clusters in the chromosome):
ClusterID start end
Chloramphenicol 100 130
NRPS 403 489
Terpene 5021 5109
...
What I want to do is add another column to dataframe 1 containing the corresponding "ClusterID" from dataframe 2 if the geneID falls between the "start" and "end" of that gene cluster, and NA otherwise:
geneID distance pvalue ClusterID
4 30 0.05 NA
409 0 0.001 NRPS
60 41 0.02 NA
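For reproducibility, the rows shown above can be recreated in R roughly as follows. This is only a sketch: it contains just the rows listed here (the elided rows are left out), the column names are taken from the tables, and the object names `ChIP_table` and `biosynthetic_clusters` are the ones used in the attempt below.

```r
# Dataframe 1: only the three rows shown above
ChIP_table <- data.frame(
  geneID   = c(4, 409, 60),
  distance = c(30, 0, 41),
  pvalue   = c(0.05, 0.001, 0.02)
)

# Dataframe 2: only the three clusters shown above
biosynthetic_clusters <- data.frame(
  ClusterID = c("Chloramphenicol", "NRPS", "Terpene"),
  start     = c(100, 403, 5021),
  end       = c(130, 489, 5109),
  stringsAsFactors = FALSE  # keep ClusterID as character on R < 4.0
)
```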
I've tried using vectors as values in a mutate function:
ChIP_table %>%
  mutate(ClusterID = case_when(
    (ID >= biosynthetic_clusters$start & ID <= biosynthetic_clusters$end) ~ biosynthetic_clusters$Cluster,
    TRUE ~ "NA"))
This didn't work, and I'm not sure where to go from here. I also tried building a for loop, but I couldn't figure out how to use the values of a vector/column as the conditions for labeling rows.
Any help would be appreciated!
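One likely reason the `mutate()` attempt fails: a comparison such as `ID >= biosynthetic_clusters$start` works element by element (with recycling), so it never checks each gene against every cluster range. Below is a minimal base-R sketch of one possible approach, not a definitive solution. It assumes the column names shown in the tables (`geneID`, `ClusterID`, `start`, `end`) and that cluster ranges do not overlap. `find_cluster` is a hypothetical helper name introduced only for illustration.

```r
# Example data: only the rows shown in the question (elided rows omitted)
ChIP_table <- data.frame(
  geneID   = c(4, 409, 60),
  distance = c(30, 0, 41),
  pvalue   = c(0.05, 0.001, 0.02)
)
biosynthetic_clusters <- data.frame(
  ClusterID = c("Chloramphenicol", "NRPS", "Terpene"),
  start     = c(100, 403, 5021),
  end       = c(130, 489, 5109),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the ClusterID whose [start, end] range
# contains `id`, or NA if no range does. Assumes ranges do not overlap;
# if they did, only the first match would be kept.
find_cluster <- function(id, clusters) {
  hit <- clusters$ClusterID[id >= clusters$start & id <= clusters$end]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# Apply the lookup to every geneID, one gene at a time
ChIP_table$ClusterID <- vapply(
  ChIP_table$geneID, find_cluster, character(1),
  clusters = biosynthetic_clusters
)
print(ChIP_table)
```

On the three sample rows this labels geneID 409 as NRPS and leaves the other two as NA, matching the desired output above. With about 300 genes and 30 clusters, a per-row lookup like this should be fast enough. A non-equi join would be an alternative for larger tables.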