
I'm trying to load GTF files into R, but it seems that all of the previously recommended methods are defunct.

import("file.gtf")
importgtf("file.gtf")
read.gtf("file.gtf")

For every one of these, when I try to install the package that provides the function, I get the error "package ‘package’ is not available for this version of R". Is there an updated method for reading GTF files? The usual read.table() doesn't work either, as GTF files aren't straightforward TSVs.

user2954167
  • What version of R are you using? Which packages are failing to install? – Nick ODell Oct 31 '22 at 01:39
  • I think it's just a tab separated text file so you just have to skip the first two lines and it should read. I usually use `vroom::vroom()`. – Dan Adams Oct 31 '22 at 02:28
  • As I said in the original question, it's not a TSV. The last column is basically a dictionary of additional values, which would be tedious to unpack by hand – user2954167 Oct 31 '22 at 15:18
  • Sorry, I misread and was thinking of a .gct file. – Dan Adams Oct 31 '22 at 18:02
  • Could try `rtracklayer::import()`. – Dan Adams Oct 31 '22 at 19:19
  • That gives the error "Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘import’ for signature ‘"data.frame", "missing", "missing"’" – user2954167 Oct 31 '22 at 20:19
  • Without your specific .gtf it's hard to give much more help, although it might have to do with a dependency of rtracklayer. Is there a similarly structured .gtf file in the public domain somewhere you could share to make your problem reproducible? – Dan Adams Oct 31 '22 at 23:18
  • Does this answer your question? [How should I deal with "package 'xxx' is not available (for R version x.y.z)" warning?](https://stackoverflow.com/questions/25721884/how-should-i-deal-with-package-xxx-is-not-available-for-r-version-x-y-z-wa) – Ghoti Nov 01 '22 at 00:11
  • The .gtf files are from this git repository (which is public) https://github.com/vsbuffalo/bds-files/tree/master/chapter-09-working-with-range-data – user2954167 Nov 01 '22 at 00:33
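
One likely cause of the "not available for this version of R" message, assuming the package being installed is rtracklayer: it is distributed through Bioconductor rather than CRAN, so a plain install.packages("rtracklayer") cannot find it. A minimal sketch of the usual Bioconductor installation route:

```r
# Install the Bioconductor package manager from CRAN if it is missing.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install rtracklayer from Bioconductor (assumed here to be the package
# that failed to install; the question does not name it).
BiocManager::install("rtracklayer")
```

BiocManager picks the Bioconductor release that matches the running R version, so a very old R installation may still need upgrading first.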

1 Answer


It works for me with rtracklayer::readGFF():

library(rtracklayer)

g <- readGFF("https://raw.githubusercontent.com/vsbuffalo/bds-files/master/chapter-09-working-with-range-data/mm_GRCm38.75_protein_coding_genes.gtf")

head(g)
#>   seqid         source type   start     end score strand phase
#> 1     1 protein_coding gene 3205901 3671498    NA      -    NA
#> 2     1 protein_coding gene 4343507 4360314    NA      -    NA
#> 3     1 protein_coding gene 4490928 4496413    NA      -    NA
#> 4     1 protein_coding gene 4773206 4785739    NA      -    NA
#> 5     1 protein_coding gene 4807788 4886770    NA      +    NA
#> 6     1 protein_coding gene 4857814 4897909    NA      +    NA
#>              gene_id gene_name    gene_source   gene_biotype
#> 1 ENSMUSG00000051951      Xkr4 ensembl_havana protein_coding
#> 2 ENSMUSG00000025900       Rp1        ensembl protein_coding
#> 3 ENSMUSG00000025902     Sox17        ensembl protein_coding
#> 4 ENSMUSG00000033845    Mrpl15 ensembl_havana protein_coding
#> 5 ENSMUSG00000025903    Lypla1 ensembl_havana protein_coding
#> 6 ENSMUSG00000033813     Tcea1 ensembl_havana protein_coding

Created on 2022-11-01 with reprex v2.0.2
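
If installing rtracklayer is not an option, the attribute column can also be unpacked with base R alone. The following is a minimal sketch, not a replacement for readGFF(): `read_gtf_base()` is a hypothetical helper name, and it assumes the standard nine tab-separated GTF columns, comment lines starting with "#", and attributes written as `key "value";` pairs with quoted values.

```r
# Hypothetical helper (not part of any package): read a GTF file with base R.
read_gtf_base <- function(path) {
  lines <- readLines(path)
  # Drop empty lines and header/comment lines, which start with "#".
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]

  # Read the nine fixed columns; "." marks a missing value in GTF.
  gtf <- read.table(
    text = lines, sep = "\t", quote = "", comment.char = "",
    header = FALSE, na.strings = ".", stringsAsFactors = FALSE,
    col.names = c("seqid", "source", "type", "start", "end",
                  "score", "strand", "phase", "attributes"),
    colClasses = c("character", "character", "character", "integer",
                   "integer", "numeric", "character", "integer", "character")
  )

  attrs <- gtf$attributes

  # Collect every attribute key, i.e. each word followed by a space and a quote.
  key_hits <- gregexpr('[A-Za-z_][A-Za-z0-9_]*(?= ")', attrs, perl = TRUE)
  keys <- unique(unlist(regmatches(attrs, key_hits)))

  # Extract the quoted value for one key; NA where a row lacks that key.
  get_attr <- function(key) {
    pattern <- paste0('(^|; *)', key, ' "([^"]*)"')
    m <- regmatches(attrs, regexec(pattern, attrs))
    vapply(m, function(x) if (length(x) == 3) x[3] else NA_character_,
           character(1))
  }

  # Add one column per attribute key, then drop the raw attribute string.
  for (k in keys) gtf[[k]] <- get_attr(k)
  gtf$attributes <- NULL
  gtf
}

# Usage ("file.gtf" is a placeholder path):
# g <- read_gtf_base("file.gtf")
# head(g)
```

Unquoted attribute values (for example a bare `exon_number 1;`, which some GTF files contain) are ignored by this sketch, so readGFF() remains the more robust choice when it is available.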

Dan Adams