I have multiple fasta sequences within a text file, looking like this:

>header1
ACTGACTG
>header2
ATGCATGC
...

I would like to apply a function to all of the sequences at once. Is there a function that does this?

Every answer will be appreciated.

Elif
  • 1
    If you're looking for parallel computing, there are R packages like `parallel` that can help. But the question is too broad to be answerable as it stands; please edit it to give more details. – Rui Barradas Jul 12 '20 at 20:46
  • 1
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jul 12 '20 at 22:04
  • 1
    Here's a tutorial on using GNU parallel in bioinformatics: https://gist.github.com/Brainiarc7/7af2ab5e88ef238da2d9f36b4be203c0 (this site is for programming-related questions; https://bioinformatics.stackexchange.com/ is probably what you want in future) – jared_mamrot Jul 12 '20 at 23:13
  • 1
    See if this package helps: [seqinr](https://cran.r-project.org/package=seqinr). Please clarify what the expected output is; "apply a function" is too broad. – zx8754 Jul 13 '20 at 07:39

1 Answer

The short answer is sapply(). To apply a function to every element of a vector or list, use sapply(), which works much like map() in Python: it calls the function on each element and collects the results, simplifying them to a vector where possible. Here is an example:

> v <- sample(1:100, 10)
> v
 [1] 92 69 87 42  7 33 51 62 26 80
> f <- function(x){
+     # TRUE if x is even, FALSE otherwise
+     return(x %% 2 == 0)
+ }
> sapply(v, FUN = f)
 [1]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE

Example with DNA:

> library('dplyr')
> v <- c('ATGCTAGCT', 'GTGTACGTAC')
> sapply(v, FUN = function(dna){
+     return(dna %>% tolower)
+ })
   ATGCTAGCT   GTGTACGTAC 
 "atgctagct" "gtgtacgtac" 
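To connect this to the FASTA file in the question, here is a minimal sketch in base R, assuming a plain-text FASTA file in which header lines start with `>` and a sequence may span several lines. The helper names `read_fasta_lines` and `gc_content` are made up for illustration and do not come from any package, and GC content is only a stand-in for whatever function you actually want to apply.

```r
# Hypothetical helper: turn FASTA-formatted lines into a named character
# vector (names = headers, values = sequences).
read_fasta_lines <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]            # drop empty lines
  is_header <- startsWith(lines, ">")
  record_id <- cumsum(is_header)           # which record each line belongs to
  headers <- sub("^>", "", lines[is_header])
  # Group the sequence lines by record; the explicit factor levels keep
  # records that have a header but no sequence lines.
  groups <- split(
    lines[!is_header],
    factor(record_id[!is_header], levels = seq_along(headers))
  )
  # Join multi-line sequences into one string per record
  seqs <- vapply(groups, paste, character(1), collapse = "")
  names(seqs) <- headers
  seqs
}

# Inline stand-in for the file contents; with a real file you would use
# something like readLines("sequences.txt") (file name is an assumption).
fasta_lines <- c(">header1", "ACTGACTG", ">header2", "ATGCATGC")
seqs <- read_fasta_lines(fasta_lines)

# Example function to apply: fraction of G/C bases in one sequence
gc_content <- function(dna) {
  bases <- strsplit(dna, "")[[1]]
  mean(bases %in% c("G", "C"))
}

# Apply the function to every sequence; the result is named by header
result <- sapply(seqs, gc_content)
print(result)
```

Because the sequences end up in a named vector, the result of sapply() keeps the FASTA headers as names, which makes it easy to match each value back to its record.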
777moneymaker