I have an ndarray with string entries. I also have a second array that contains every string entry of the ndarray exactly once. I want to replace each string in the ndarray with its position (index) in the second array. I have tried this:

import scipy as sp
extern_nodes = sp.array([['B11', 'B6', '-1', '-1', '-1', '-1'],
       ['B9', 'B3', '-1', '-1', '-1', '-1'],
       ['B10', 'B5', '-1', '-1', '-1', '-1'],
       ['B8', 'B2', '-1', '-1', '-1', '-1'],
       ['B16', 'B6', '-1', '-1', '-1', '-1'],
       ['B15', 'B5', '-1', '-1', '-1', '-1'],
       ['B14', 'B3', '-1', '-1', '-1', '-1'],
       ['B12', 'B1', '-1', '-1', '-1', '-1'],
       ['B13', 'B2', '-1', '-1', '-1', '-1']], dtype='<U6')
nodes_sorted = sp.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
       'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')

for i in range(0, len(nodes_sorted)):
    extern_nodes = sp.char.replace(extern_nodes, nodes_sorted[i], str(i))

This gives the following result:

Out:extern_nodes: array([['01', '5', '-1', '-1', '-1', '-1'],
       ['7', '2', '-1', '-1', '-1', '-1'],
       ['00', '4', '-1', '-1', '-1', '-1'],
       ['6', '1', '-1', '-1', '-1', '-1'],
       ['06', '5', '-1', '-1', '-1', '-1'],
       ['05', '4', '-1', '-1', '-1', '-1'],
       ['04', '2', '-1', '-1', '-1', '-1'],
       ['02', '0', '-1', '-1', '-1', '-1'],
       ['03', '1', '-1', '-1', '-1', '-1']], dtype='<U2')

This means that entries of the form "B1X" were turned into "0X" in the first step, because the substring "B1" was replaced by "0".

I couldn't find a way to restrict the replacement to exact matches of the whole string. After the first step of the for loop I want to get the ndarray below, where only entries equal to "B1" are replaced by "0" and other strings that merely contain "B1" are left alone:

extern_nodes = sp.array([['B11', 'B6', '-1', '-1', '-1', '-1'],
   ['B9', 'B3', '-1', '-1', '-1', '-1'],
   ['B10', 'B5', '-1', '-1', '-1', '-1'],
   ['B8', 'B2', '-1', '-1', '-1', '-1'],
   ['B16', 'B6', '-1', '-1', '-1', '-1'],
   ['B15', 'B5', '-1', '-1', '-1', '-1'],
   ['B14', 'B3', '-1', '-1', '-1', '-1'],
   ['B12', '0', '-1', '-1', '-1', '-1'],
   ['B13', 'B2', '-1', '-1', '-1', '-1']], dtype='<U6')
serv-inc
Search898

3 Answers

Your code with minimal changes:

import pandas as pd
import scipy as sp
extern_nodes = sp.array([['B11', 'B6', '-1', '-1', '-1', '-1'],
       ['B9', 'B3', '-1', '-1', '-1', '-1'],
       ['B10', 'B5', '-1', '-1', '-1', '-1'],
       ['B8', 'B2', '-1', '-1', '-1', '-1'],
       ['B16', 'B6', '-1', '-1', '-1', '-1'],
       ['B15', 'B5', '-1', '-1', '-1', '-1'],
       ['B14', 'B3', '-1', '-1', '-1', '-1'],
       ['B12', 'B1', '-1', '-1', '-1', '-1'],
       ['B13', 'B2', '-1', '-1', '-1', '-1']], dtype='<U6')
nodes_sorted = sp.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
       'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')
my_nodes = pd.DataFrame(extern_nodes)

for i in range(0, len(nodes_sorted)):
    my_nodes = my_nodes.applymap(lambda x: str(i) if x == nodes_sorted[i] else x)

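This uses pandas and compares each cell for exact equality with the node name, instead of replacing substrings. Note that my_nodes is a DataFrame; my_nodes.to_numpy() converts it back to an array.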
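As a possible variation on the loop above, the whole mapping can be built once as a dict and applied in a single call. This is only a sketch: it uses a shortened copy of the arrays from the question, and the names `label_to_index`, `replaced` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

# A shortened version of the arrays from the question.
extern_nodes = np.array([['B11', 'B6', '-1'],
                         ['B12', 'B1', '-1'],
                         ['B10', 'B5', '-1']], dtype='<U6')
nodes_sorted = np.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
                         'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')

# Map each label to its position, kept as a string like in the loop version.
label_to_index = {str(label): str(i) for i, label in enumerate(nodes_sorted)}

# With a plain dict and the default regex=False, DataFrame.replace swaps
# whole cell values only, so 'B11' is not affected by the 'B1' entry.
replaced = pd.DataFrame(extern_nodes).replace(label_to_index)

# Back to a NumPy array if that is what the rest of the code expects.
result = replaced.to_numpy()
print(result)
```

Entries that are not keys of the dict, such as '-1', are left as they are.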

serv-inc

Please try the code below. You can easily include it in your for loop.

import numpy as np

arr = np.array(['A1','A2','A3','test'],dtype='<U6')
arr[arr=='A2'] = 0
arr

Out[1]: array(['A1', '0', 'A3', 'test'], dtype='<U6')

arr[arr=='A2'] = 0 compares every element of the array with the given value ('A2' in this case) and assigns the new value (0) only to the elements that are equal to it.
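Put into a loop over the node names, the mask approach might look like the sketch below. It uses a shortened copy of the arrays from the question and writes the positions into a separate integer array called `indices` (a name chosen here for illustration). It also assumes that entries without a match, such as the '-1' fillers, should simply become the integer -1.

```python
import numpy as np

# A shortened version of the arrays from the question.
extern_nodes = np.array([['B11', 'B6', '-1'],
                         ['B12', 'B1', '-1'],
                         ['B10', 'B5', '-1']], dtype='<U6')
nodes_sorted = np.array(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B8', 'B9', 'B10', 'B11',
                         'B12', 'B13', 'B14', 'B15', 'B16'], dtype='<U3')

# Assumption: anything not found in nodes_sorted is marked with -1.
indices = np.full(extern_nodes.shape, -1, dtype=int)

for i, label in enumerate(nodes_sorted):
    # The mask is True only where the whole string equals the label,
    # so 'B11' never matches 'B1'.
    indices[extern_nodes == label] = i

print(indices)
```

Because every mask is computed from the untouched extern_nodes, values written in earlier iterations cannot be matched again by later ones.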

Sokolokki

I think you can do:

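import re
extern_nodes = [[re.sub('\\bB1\\b','0',y) for y in x] for x in extern_nodes]

\\bB1\\b: matches B1 only as a whole word, so B11 or B12 are left unchanged. Without the trailing \\b, the pattern would also match the B1 at the start of B11.

If you want to preserve the first column, do:

[[x[0]] + [re.sub('\\bB1\\b','0',y) for y in x[1:]] for x in extern_nodes]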
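To see how the word-boundary anchor \b affects matching on labels like these, the following small check compares a pattern anchored only at the start with one anchored on both sides. The list `labels` and the result names are made up for illustration.

```python
import re

labels = ['B1', 'B11', 'B12']

# Anchored only at the start: the leading 'B1' inside 'B11' still matches.
start_only = [re.sub(r'\bB1', '0', s) for s in labels]

# Anchored on both sides: only the standalone 'B1' matches.
whole_word = [re.sub(r'\bB1\b', '0', s) for s in labels]

print(start_only)  # ['0', '01', '02']
print(whole_word)  # ['0', 'B11', 'B12']
```

There is no word boundary between the two digits of 'B11', which is why the trailing \b stops the match there.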
YOLO