Replace only exactly matching strings in a Python ndarray

Question

I have an ndarray with string entries. I also have a second array that contains each string entry of the ndarray exactly once. I want to replace each string in the ndarray with its position (index) in the second array. I have tried this:

import scipy as sp
extern_nodes = sp.array([['B11', 'B6', '-1', '-1', '-1', '-1'],
       ['B9', 'B3', '-1', '-1', '-1', '-1'],
       ['B10', 'B5', '-1', '-1', '-1', '-1'],
       ['B8', 'B2', '-1', '-1', '-1', '-1'],
       ['B16', 'B6', '-1', '-1', '-1', '-1'],
       ['B15', 'B5', '-1', '-1', '-1', '-1'],
       ['B14', 'B3', '-1', '-1', '-1', '-1'],
       ['B12', 'B1', '-1', '-1', '-1', '-1'],
       ['B13', 'B2', '-1', '-1', '-1', '-1']], dtype='<U6')
nodes_sorted = sp.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
       'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')

for i in range(0, len(nodes_sorted)):
    extern_nodes = sp.char.replace(extern_nodes, nodes_sorted[i], str(i))

And this is the result I get:

Out:extern_nodes: array([['01', '5', '-1', '-1', '-1', '-1'],
       ['7', '2', '-1', '-1', '-1', '-1'],
       ['00', '4', '-1', '-1', '-1', '-1'],
       ['6', '1', '-1', '-1', '-1', '-1'],
       ['06', '5', '-1', '-1', '-1', '-1'],
       ['05', '4', '-1', '-1', '-1', '-1'],
       ['04', '2', '-1', '-1', '-1', '-1'],
       ['02', '0', '-1', '-1', '-1', '-1'],
       ['03', '1', '-1', '-1', '-1', '-1']], dtype='<U2')

This means the entries "B1X" were turned into "0X" in the first step, because "B1" was replaced by "0" wherever it occurred as a substring.
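A minimal reproduction of that first step, assuming np.char.replace behaves like the sp.char.replace call above (the variable names cells and substring_replaced are made up for this illustration):

```python
import numpy as np

# A few representative cells from the question's array.
cells = np.array(['B1', 'B11', 'B16', '-1'], dtype='<U6')

# np.char.replace works like str.replace on every element,
# so it also rewrites the 'B1' prefix inside 'B11' and 'B16'.
substring_replaced = np.char.replace(cells, 'B1', '0')
print(substring_replaced)  # ['0' '01' '06' '-1']
```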

I couldn't find a way to restrict the replacement to exactly matching strings. My aim is to get the following ndarray after the first step of the for loop (only entries that are exactly "B1" become "0"; other strings that merely contain "B1" stay untouched):

extern_nodes = sp.array([['B11', 'B6', '-1', '-1', '-1', '-1'],
   ['B9', 'B3', '-1', '-1', '-1', '-1'],
   ['B10', 'B5', '-1', '-1', '-1', '-1'],
   ['B8', 'B2', '-1', '-1', '-1', '-1'],
   ['B16', 'B6', '-1', '-1', '-1', '-1'],
   ['B15', 'B5', '-1', '-1', '-1', '-1'],
   ['B14', 'B3', '-1', '-1', '-1', '-1'],
   ['B12', '0', '-1', '-1', '-1', '-1'],
   ['B13', 'B2', '-1', '-1', '-1', '-1']], dtype='<U6')

Possible duplicate of [Replace exact substring in python](https://stackoverflow.com/questions/31697043/replace-exact-substring-in-python) — pault, Jan 11 '19 at 13:58
I've added the numpy tag because sp.array is identical to np.array, and you aren't using any scipy specific operations. — Mad Physicist, Jan 11 '19 at 14:11

score 1 · Answer 1 · answered Jan 11 '19 at 14:08

Your code with minimal changes:

import pandas as pd
import scipy as sp
extern_nodes = sp.array([['B11', 'B6', '-1', '-1', '-1', '-1'],
       ['B9', 'B3', '-1', '-1', '-1', '-1'],
       ['B10', 'B5', '-1', '-1', '-1', '-1'],
       ['B8', 'B2', '-1', '-1', '-1', '-1'],
       ['B16', 'B6', '-1', '-1', '-1', '-1'],
       ['B15', 'B5', '-1', '-1', '-1', '-1'],
       ['B14', 'B3', '-1', '-1', '-1', '-1'],
       ['B12', 'B1', '-1', '-1', '-1', '-1'],
       ['B13', 'B2', '-1', '-1', '-1', '-1']], dtype='<U6')
nodes_sorted = sp.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
       'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')
my_nodes = pd.DataFrame(extern_nodes)

for i in range(0, len(nodes_sorted)):
    my_nodes = my_nodes.applymap(lambda x: str(i) if x == nodes_sorted[i] else x)

This uses pandas and replaces a cell only if it is equal to the node name, so substrings are never touched.
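As a possible alternative to the loop, here is a minimal sketch that builds a dictionary from node name to index and hands it to DataFrame.replace, which by default compares whole cell values rather than substrings. The names index_of and replaced are introduced here for illustration only, and the array is shortened to three rows and three columns.

```python
import numpy as np
import pandas as pd

extern_nodes = np.array([['B11', 'B6', '-1'],
                         ['B10', 'B5', '-1'],
                         ['B12', 'B1', '-1']], dtype='<U6')
nodes_sorted = np.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
                         'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')

# Hypothetical lookup table: node name -> its position, stored as a string.
index_of = {str(name): str(i) for i, name in enumerate(nodes_sorted)}

# With regex left at its default (False), replace() only swaps cells whose
# whole value equals a dictionary key, so 'B11' is not hit by the 'B1' entry.
replaced = pd.DataFrame(extern_nodes).replace(index_of).to_numpy()
print(replaced)
```

Calling to_numpy() at the end turns the DataFrame back into an ndarray, in case the rest of the code expects one.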

You're welcome. You can mark this as the solution by clicking the check mark to the left. — serv-inc, Jan 12 '19 at 17:42

score 1 · Answer 2 · answered Jan 11 '19 at 14:15

Please try the code below. You can easily include it in your for loop.

import numpy as np

arr = np.array(['A1','A2','A3','test'],dtype='<U6')
arr[arr=='A2'] = 0  # boolean mask: True only where the element is exactly 'A2'
arr

Out[1]: array(['A1', '0', 'A3', 'test'], dtype='<U6')

arr[arr=='A2'] = 0 compares every element of the array with the given value ('A2' in this case) and assigns the new value (0, stored as the string '0') only to the elements that match exactly.
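Applied to the arrays from the question, the loop could look roughly like the sketch below. It is shortened to three rows and three columns, and the name result is an assumption of this example. The masks are computed on the original array and written into a copy, so a value that has already been replaced can never be matched again by a later iteration.

```python
import numpy as np

extern_nodes = np.array([['B11', 'B6', '-1'],
                         ['B10', 'B5', '-1'],
                         ['B12', 'B1', '-1']], dtype='<U6')
nodes_sorted = np.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
                         'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')

# Write into a copy so every mask is computed on the untouched original.
result = extern_nodes.copy()
for i, name in enumerate(nodes_sorted):
    # Elementwise equality: True only where the whole string equals `name`.
    result[extern_nodes == name] = str(i)
print(result)
```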

score 0 · Answer 3 · YOLO

I think you can do:

import re
extern_nodes = [[re.sub('\\bB1','0',y) for y in x] for x in extern_nodes]

\\bB1 matches B1 at the start of a word.

If you want to preserve the first column, do:

[[x[0]] + [re.sub('\\bB1','0',y) for y in x[1:]] for x in extern_nodes]

answered Jan 11 '19 at 13:45 · edited Jan 11 '19 at 13:58

With your code I get the same result :/ The first column should not be replaced – Search898 Jan 11 '19 at 13:53
Try `r"\bB1\b"` as the pattern. You need the boundaries on both sides. – pault Jan 11 '19 at 13:56
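A short sketch of the pattern with a boundary on both sides, as suggested in the comment above. The names pattern and replaced_cells are made up for this example, and the array is shortened to two rows.

```python
import re
import numpy as np

extern_nodes = np.array([['B11', 'B6', '-1'],
                         ['B12', 'B1', '-1']], dtype='<U6')

# \b on both sides: 'B1' must not be followed by another word character,
# so 'B11' and 'B12' no longer match.
pattern = re.compile(r"\bB1\b")

replaced_cells = np.array([[pattern.sub('0', cell) for cell in row]
                           for row in extern_nodes])
print(replaced_cells)
# [['B11' 'B6' '-1']
#  ['B12' '0' '-1']]
```

Note that \b also treats characters such as '-' as a boundary, so a cell like 'x-B1' would still be changed. If only whole-cell matches should count, re.fullmatch or a plain equality comparison is the stricter choice.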
