I have the following vector:
my.vector = c("4M1D5M15I1D10M", "3M", "4M2I3D")
And I'd like to transform it into the following vector:
my.result = c("21N", "3N", "7N")
The logic behind these results is as follows. For "4M1D5M15I1D10M", I added all the numbers except the ones preceding an "I" character, i.e., 4+1+5+1+10=21 (I did not add 15 because it precedes an "I"), and then pasted an N right after 21, giving "21N".
The same goes for "3M": there is no "I" character, so it just becomes "3N".
Likewise for the last one, 4+3=7 (I did not add 2 because it precedes an "I"), giving "7N".
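To make the rule concrete, here is a step-by-step sketch for a single string. It is only an illustration of the logic described above, not a proposed solution: it assumes every token is a run of digits followed by one uppercase letter, and the names tokens, counts, ops, total and result are made up for this example.

```r
# Worked illustration of the rule for one string (all names are illustrative).
x <- "4M1D5M15I1D10M"

# Split the string into number+letter tokens:
# "4M" "1D" "5M" "15I" "1D" "10M"
tokens <- regmatches(x, gregexpr("[0-9]+[A-Z]", x))[[1]]

# Separate the numeric part from the trailing letter of each token
counts <- as.numeric(sub("[A-Z]$", "", tokens))
ops    <- sub("^[0-9]+", "", tokens)

# Add every count except those whose letter is "I", then append "N"
total  <- sum(counts[ops != "I"])   # 4 + 1 + 5 + 1 + 10
result <- paste0(total, "N")
print(result)                       # "21N"
```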
Note that my.vector is extremely large, so I want to use the parallel capabilities of the HPC server via mclapply (from the parallel package). Ideally I'd run something like this to get my result:
my.result = unlist(mclapply(my.vector, my.adding.function, mc.cores = ncores))
To define my function, I tried the following:
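my.adding.function <- function(x)
{
  # Drop every "<number>I" token, keeping the pieces on either side of it
  tmp = unlist(strsplit(x, "\\d+I"))
  # Split the remaining pieces on the operation letters to isolate the numbers
  tmp2 = unlist(strsplit(tmp, "M|D|S|N"))
  # Sum the numbers and append "N"
  tmp3 = sum(as.numeric(tmp2))
  return(paste(tmp3, "N", sep=""))
}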
I'm not sure how efficient this function is, though...
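For reference, this is a minimal sketch of how I would check the function against the expected output and get a rough timing before scaling up. The value ncores = 2 and the replicated test vector big are assumptions for illustration only; mclapply relies on forking, so mc.cores greater than 1 is not expected to work on Windows.

```r
library(parallel)  # provides mclapply

# Same function as above, repeated so this snippet runs on its own
my.adding.function <- function(x) {
  tmp  <- unlist(strsplit(x, "\\d+I"))
  tmp2 <- unlist(strsplit(tmp, "M|D|S|N"))
  paste0(sum(as.numeric(tmp2)), "N")
}

my.vector <- c("4M1D5M15I1D10M", "3M", "4M2I3D")
ncores <- 2L  # assumption: adjust to the cores actually allocated to the job

# Check the small example against the expected result
my.result <- unlist(mclapply(my.vector, my.adding.function, mc.cores = ncores))
stopifnot(identical(my.result, c("21N", "3N", "7N")))

# Rough timing on a larger, artificial input (hypothetical test data)
big <- rep(my.vector, 1e4)
timing <- system.time(
  big.result <- unlist(mclapply(big, my.adding.function, mc.cores = ncores))
)
print(timing)
```

The timing only gives a rough idea; with one very short task per element, the per-task overhead of mclapply may dominate the actual work.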