-3

I have a table of (Lnc, gene) pairs with the distance between them, and I need to filter it so that, for each Lnc, only the closest gene is kept.

mytable

example

GeneX  Lnc1  1KB
GeneY  Lnc4  20KB

Thank you in advance

Rui Barradas
JaO
    Images and screenshots can be a nice addition to a post, but please make sure the post is still clear and useful without them. **Don't post images of code, data or error messages.** Instead copy and paste or type the actual code/data/message into the post directly. – rsjaffe Aug 11 '18 at 20:59

2 Answers

1

Below is one possible dplyr solution. Please try to make your questions reproducible by sharing a minimal dataset/code.

# importing the necessary package
library(dplyr)

# reproducing your data
df <- tibble(  # tibble() replaces the deprecated data_frame()
  Gene = c("Gene X", "Gene X", "Gene X", "Gene Y"),
  Lnc = c("Lnc1", "Lnc2", "Lnc3", "Lnc4"),
  `Distance (KB)` = c(1, 300, 200, 20)
)

# grouping by Gene and choosing the minimum Gene-Lnc distance 
df %>%
  group_by(Gene) %>%
  filter(`Distance (KB)` == min(`Distance (KB)`))

# # A tibble: 2 x 3
# # Groups:   Gene [2]
#   Gene   Lnc   `Distance (KB)`
#   <chr>  <chr>           <dbl>
# 1 Gene X Lnc1                1
# 2 Gene Y Lnc4               20
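
Note that `filter()` combined with `min()` keeps every row tied for the minimum distance. As a hedged sketch, assuming dplyr 1.0.0 or later is installed, `slice_min()` can express the same selection and lets you choose how ties are handled. The tied row `Lnc5` and the names `df_ties`, `closest_all` and `closest_one` are made up for illustration and are not part of the original data.

```r
library(dplyr)

# Same toy data as above, plus one hypothetical tied row (Lnc5) for illustration
df_ties <- tibble(
  Gene = c("Gene X", "Gene X", "Gene X", "Gene Y", "Gene Y"),
  Lnc = c("Lnc1", "Lnc2", "Lnc3", "Lnc4", "Lnc5"),
  `Distance (KB)` = c(1, 300, 200, 20, 20)
)

# with_ties = TRUE (the default) keeps both Lnc4 and Lnc5 for Gene Y
closest_all <- df_ties %>%
  group_by(Gene) %>%
  slice_min(`Distance (KB)`, n = 1) %>%
  ungroup()

# with_ties = FALSE keeps exactly one row per gene
closest_one <- df_ties %>%
  group_by(Gene) %>%
  slice_min(`Distance (KB)`, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Which variant is appropriate depends on whether tied pairs should all be reported or collapsed to a single row.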
OzanStats
0

If each gene has only one (Lnc, Gene) pair at the closest distance, you can also use the following:

# sort by distance, then take the first (closest) Lnc within each gene
df %>%
  group_by(Gene) %>%
  arrange(`Distance (KB)`) %>%
  summarise(Lnc = first(Lnc), Dist = first(`Distance (KB)`))
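
The same sort-then-take-first idea can be sketched in base R without dplyr. This is only an illustration under stated assumptions: the data frame `df_base` and the column name `Distance_KB` are hypothetical, chosen to mirror the toy data above while avoiding backticks.

```r
# Toy data mirroring the example above (names df_base and Distance_KB are assumptions)
df_base <- data.frame(
  Gene = c("Gene X", "Gene X", "Gene X", "Gene Y"),
  Lnc = c("Lnc1", "Lnc2", "Lnc3", "Lnc4"),
  Distance_KB = c(1, 300, 200, 20)
)

# Sort all rows by distance, smallest first
sorted <- df_base[order(df_base$Distance_KB), ]

# Keep only the first row seen for each gene, i.e. its closest Lnc
closest <- sorted[!duplicated(sorted$Gene), ]
```

Like the `summarise()` version, this returns a single row per gene even when several Lncs are tied for the minimum distance.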
Nar