I have a script that joins two data sets on their columns and keeps the rows that overlap, as follows:

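library(dplyr)  # provides join_by() and inner_join()
gff_file <- read.table("gencode.v44.long_noncoding_RNAs.gtf", sep = "\t")
# Strip the "chr" prefix: "chr1" -> "1", "chrX" -> "X"
gff_file$V1 <- gsub("^chr", "", gff_file$V1)
# Works for numbered chromosomes only; "X" and "Y" become NA here
gff_file$V1 <- as.numeric(gff_file$V1)
vcf_file <- read.csv("pha005195_test.csv", sep = ";", header = FALSE)
# Match on chromosome; vcf_file$V4 must lie between gff_file$V4 and gff_file$V5
by <- join_by(V3 == V1, between(V4, V4, V5))
df <- inner_join(vcf_file, gff_file, by)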

The values in gff_file$V1 are either numbered chromosomes (chr1, chr2, etc.) or chrX and chrY. For the join, the column has to be converted with as.numeric in the first case and with as.character in the second. vcf_file$V3 likewise contains either 1, 2, 3 or X, Y.

I could repeat the same lines for the character case and run the join a second time, but with

gff_file$V1=as.character(gff_file$V1)

Is there a simpler way to write this in R that handles both cases at once?

  • 4
    I think just turning everything into a character is probably your best bet – Mark Sep 01 '23 at 10:39
  • 5
    but this would be a lot easier to answer if you included some of your input. dput(df) or dput(head(df)) is your friend. Add to question if you can please! – Mark Sep 01 '23 at 10:46
  • Yes, this is exactly what I did, but I was wondering whether there is an alternative. Thank you! – Dimitris Zisis Sep 01 '23 at 10:47
  • 1
    DimitrisZisis, you're more likely to get help if you add the sample data as Mark suggested. See https://stackoverflow.com/q/5963269 , [mcve], and https://stackoverflow.com/tags/r/info for discussions on why this is important, and suggested methods including `dput`, `data.frame`, and `read.table` for giving us unambiguous sample data. Thank you! – r2evans Sep 01 '23 at 12:32
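
Mark's suggestion of turning everything into a character could look roughly like the sketch below. It is only an illustration under assumptions: the two small data frames are made-up stand-ins for the real files (the values are hypothetical, only the column names V1, V3, V4, V5 follow the question), and `join_by()` requires dplyr 1.1.0 or later. The `x$` and `y$` prefixes are optional and are used here only to make explicit which table each V4 comes from.

```r
library(dplyr)  # join_by() needs dplyr >= 1.1.0

# Hypothetical toy data standing in for the GTF and CSV files.
gff_file <- data.frame(
  V1 = c("chr1", "chr2", "chrX"),  # chromosome, with "chr" prefix
  V4 = c(100, 500, 1000),          # interval start
  V5 = c(200, 600, 2000),          # interval end
  stringsAsFactors = FALSE
)
vcf_file <- data.frame(
  V3 = c("1", "2", "X", "Y"),      # chromosome, no prefix
  V4 = c(150, 700, 1500, 50),      # position to test against the interval
  stringsAsFactors = FALSE
)

# Keep the chromosome as character on both sides, so "1" and "X"
# are handled by the same code path and nothing is coerced to NA.
gff_file$V1 <- as.character(sub("^chr", "", gff_file$V1))
vcf_file$V3 <- as.character(vcf_file$V3)  # also covers a numeric-only column

# Same join as in the question, with the table of each column spelled out.
by <- join_by(V3 == V1, between(x$V4, y$V4, y$V5))
df <- inner_join(vcf_file, gff_file, by)
print(df)
```

Because `as.character()` turns a numeric 1 into "1", the same lines should also work when a file contains only numbered chromosomes and `read.csv` has read the column as numeric, so the join does not need to be run twice.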

0 Answers