Numpy array Regex sub

Question

I am sure this should be a one-liner, but I can't figure out the best way to do it:

import numpy as np
import re
arr = np.array(["AB", "AC", "XAB", "XAC", "AD"])

I want to prepend "X" to every element that matches the regex "^A", i.e. every string that starts with "A".

score 2 · Accepted Answer · answered Oct 24 '14 at 04:17


What about this:

print(np.array(list(map(lambda v: re.sub(r'^A', 'XA', v), arr))))
# outputs: ['XAB' 'XAC' 'XAB' 'XAC' 'XAD']
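For reuse, the same substitution could be wrapped in a small helper with a precompiled pattern. This is only a sketch: the names `PATTERN` and `add_prefix` are hypothetical and do not appear in the original answer.

```python
import re
import numpy as np

# Compile once so the pattern is not re-parsed for every element.
PATTERN = re.compile(r"^A")

def add_prefix(values, prefix="X"):
    """Return a new array where strings starting with 'A' get `prefix` prepended.

    Hypothetical helper for illustration; strings that do not start
    with 'A' are returned unchanged.
    """
    return np.array([PATTERN.sub(prefix + "A", v) for v in values])

arr = np.array(["AB", "AC", "XAB", "XAC", "AD"])
result = add_prefix(arr)
print(result)
```

The elements are still processed one at a time in Python, so this is no faster than the `map` version; it only makes the intent easier to read.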


Marcin

Thanks a lot! :) I ended up using the regex pattern from @nu11p01n73R instead of "^A" – Kapil Sharma Oct 24 '14 at 04:30

Using a list comprehension is shorter: `np.array([re.sub('^A','XA',a) for a in arr])`. Does `arr` need to be an array, or return one? This looks like a list of strings problem, not a `numpy` one. – hpaulj Oct 24 '14 at 06:47
Ohh I see. Yeah - it need not be a numpy array. – Kapil Sharma Oct 24 '14 at 14:02

score 0 · Answer 2 · answered Oct 24 '14 at 04:14


You can use the sub function in the re module to substitute strings:

>>> import re
>>> s = "ABC"  # avoid naming the variable `str`, which shadows the built-in
>>> re.sub('^(?=A)', 'X', s)
'XABC'

^(?=A) is a lookahead assertion that matches the start position of any string that begins with 'A'. Because the match is zero-width, 'X' is inserted without replacing the 'A'.
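Applied to the array from the question, the lookahead pattern might be used as follows. The thread does not show how the asker combined the two answers, so this is an assumed combination rather than their actual code; the variable name `prefixed` is hypothetical.

```python
import re
import numpy as np

arr = np.array(["AB", "AC", "XAB", "XAC", "AD"])

# ^(?=A) matches the empty position at the start of a string that begins
# with 'A', so the replacement inserts 'X' and leaves the 'A' in place.
prefixed = np.array([re.sub(r"^(?=A)", "X", s) for s in arr])
print(prefixed)
```

Compared with replacing `^A` by `XA`, the lookahead form does not need to repeat the matched character in the replacement string.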


nu11p01n73R

Thanks. I combined your and Marcin's solution. – Kapil Sharma Oct 24 '14 at 04:31
@KapilSharma glad to hear it worked :) – nu11p01n73R Oct 24 '14 at 04:32
