I am looking for a way to replace all occurrences of 'A' with 1, 'T' with 2, 'C' with 8, and 'G' with 16 in a byte array. How can this be done?
2 Answers
1
require "narray"

# Add a helper that copies an NArray into a new array of the given element type.
class NArray
  def cast(type)
    a = NArray.new(type, *self.shape)
    a[] = self
    a
  end
end

# 256-entry lookup table indexed by byte value; unmapped bytes stay 0.
conv = NArray.int(256)
atcg = NArray.to_na('ATCG', NArray::BYTE).cast(NArray::LINT)
conv[atcg] = [1, 2, 8, 16]

seq_str = 'ABCDAGDE'
seq_ary = NArray.to_na(seq_str, NArray::BYTE).cast(NArray::LINT)
p conv[seq_ary]
#=> NArray.int(8):
# [ 1, 0, 8, 0, 1, 16, 0, 0 ]
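
For readers without NArray installed, the same lookup-table idea can be sketched in plain Ruby. This is only an illustration of the technique, not a replacement for the NArray code above; the names TABLE and encode_with_table are hypothetical and not part of any library.

```ruby
# Hypothetical plain-Ruby sketch of the lookup-table approach.
# TABLE has one slot per possible byte value; unmapped bytes stay 0.
TABLE = Array.new(256, 0)
'ATCG'.bytes.zip([1, 2, 8, 16]) { |byte, value| TABLE[byte] = value }

# Hypothetical helper: translate each byte of seq through TABLE.
def encode_with_table(seq)
  seq.bytes.map { |b| TABLE[b] }
end

p encode_with_table('ABCDAGDE')
#=> [1, 0, 8, 0, 1, 16, 0, 0]
```

As with the NArray version, any byte that is not A, T, C or G maps to 0.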

masa16
-
Very nice. How do you reckon that compares speed-wise to tr? For example: search = 'ACGTUMRWSYKVHDBN'; replace = [1, 2, 4, 8, 2, 5, 9, 3, 12, 6, 10, 13, 7, 11, 14, 15].pack("C*"); string.tr!(search, replace) – maasha Jan 06 '12 at 07:39
-
Test code (https://gist.github.com/1573753) shows tr is faster than NArray. If data is provided as a string, use String#tr. If it is in the context of numerical processing, NArray is applicable. – masa16 Jan 07 '12 at 04:23
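
For string input, the String#tr route mentioned in these comments might look roughly like the sketch below, applied to the A/T/C/G mapping from the question. The constant and method names are made up for illustration.

```ruby
# Hypothetical sketch of the String#tr approach for string input.
SEARCH  = 'ATCG'
REPLACE = [1, 2, 8, 16].pack('C*') # bytes 0x01, 0x02, 0x08, 0x10

# Hypothetical helper: returns a new string whose bytes are the mapped values.
def encode_bases(seq)
  seq.tr(SEARCH, REPLACE)
end

p encode_bases('ATCGGA').bytes
#=> [1, 2, 8, 16, 16, 1]
```

One difference from the lookup-table answer: tr leaves characters outside the search set unchanged rather than mapping them to 0.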
0
Is this what you are looking for?
h = {'A' => 1, 'T' => 2, 'C' => 8, 'G' => 16}
a = ['A', 'B', 'C', 'D', 'A', 'G', 'D', 'E']
result = a.map { |c| h.fetch(c, c) }

basgys
-
Did you note that I am not shifting base of ABC, but specifically wanting to assign new values to ATCG? – maasha Jan 05 '12 at 12:37
-
Please check NArray -> http://narray.rubyforge.org/ - I would like to do this on NArray level. Otherwise I should think that tr would be a lot faster than your proposal. – maasha Jan 05 '12 at 13:01
-
I glanced at NArray API, but I have to admit that I'm not really good at matrix and stuff like that. But as you said, there is probably a better solution and I would be interested to see it just out of curiosity. Good luck :) – basgys Jan 09 '12 at 20:34