3

Is there a way to generalize "Element-wise string concatenation in numpy" to n > 2 arrays, and also perform the join with a space " " delimiter? The np.char.add function only works for 2 arrays, and there's no option to add a delimiter.

import numpy as np

# dtype=str replaces the np.str alias, which was removed in newer NumPy releases
strings1 = np.array(["a", "b", "c"], dtype=str)
strings2 = np.array(["d", "e", "f"], dtype=str)
strings3 = np.array(["g", "h", "i"], dtype=str)

# Concatenate several string dtype arrays with a space delimiter
# I.e. something like strings1 + " " + strings2 + " " + strings3
# Code??

Desired:

array(['a d g', 'b e h', 'c f i'], dtype='<U5')
weiji14
  • As discussed in your link, `numpy` does not have fast compiled string methods. The `np.char` functions use the regular Python string methods, and have about the same speed as regular iteration. Same goes for converting the string dtype to object dtype. So there may be answers that are compact and/or convenient, but don't expect special performance. – hpaulj Aug 10 '20 at 00:38
  • Yes, thanks for mentioning that. I'm not that tied up with performance, more just readable code that does the job. But it's good for performance-minded people to know that :) – weiji14 Aug 10 '20 at 01:47

3 Answers

7

Try np.apply_along_axis

arr_list = [strings1, strings2, strings3]
arr_out = np.apply_along_axis(' '.join, 0, arr_list)

In [35]: arr_out
Out[35]: array(['a d g', 'b e h', 'c f i'], dtype='<U5')
weiji14
Andy L.
  • By way of explanation this turns `arr_list` into a 2d array of strings and applies `join` to each column. – hpaulj Aug 10 '20 at 01:46
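To make the comment above concrete, here is a minimal sketch that builds the 2D array explicitly before joining. The name `stacked` is only for illustration and is not from the answer; `np.stack` is used here just to make the shape visible, since `np.apply_along_axis` would convert the list on its own.

```python
import numpy as np

strings1 = np.array(["a", "b", "c"], dtype=str)
strings2 = np.array(["d", "e", "f"], dtype=str)
strings3 = np.array(["g", "h", "i"], dtype=str)

# One row per input array, so each column holds the items to be joined.
# `stacked` is an illustrative name, not part of the original answer.
stacked = np.stack([strings1, strings2, strings3])
print(stacked.shape)  # (3, 3)

# Apply ' '.join along axis 0, i.e. once per column.
joined = np.apply_along_axis(" ".join, 0, stacked)
print(joined)  # ['a d g' 'b e h' 'c f i']
```

One caveat worth checking on your own data: as far as I can tell, `np.apply_along_axis` sizes the output dtype from the first result, so if later joined strings are longer than the first one they may get truncated.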
2

You could use a for loop to help you achieve this:

import numpy as np

# dtype=str replaces the np.str alias, which was removed in newer NumPy releases
strings1 = np.array(["a", "b", "c"], dtype=str)
strings2 = np.array(["d", "e", "f"], dtype=str)
strings3 = np.array(["g", "h", "i"], dtype=str)

Create a list of your strings:

strings=[strings1, strings2, strings3]

Create an empty list:

list_for_new_array=[]

For loop to iterate over the arrays element-wise, joining the items at each position with a space:

for items in zip(*strings):
    # items holds one element from each array, e.g. ('a', 'd', 'g')
    list_for_new_array.append(" ".join(items))

Create new array with list created in for loop:

new_array = np.array(list_for_new_array)

  • Thanks for answering! Good to have a for-loop example that can be adapted for more advanced string concatenation jobs (e.g. with different delimiters). – weiji14 Aug 10 '20 at 02:00
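
Following up on the point about different delimiters, here is a rough sketch of how the loop could be wrapped in a small function. `join_elementwise` and its `sep` parameter are hypothetical names for illustration, not a NumPy API, and the sketch assumes all arrays have the same length.

```python
import numpy as np


def join_elementwise(arrays, sep=" "):
    """Join same-length string arrays position by position (hypothetical helper)."""
    # zip(*arrays) yields one tuple per position, e.g. ('a', 'd', 'g').
    return np.array([sep.join(items) for items in zip(*arrays)])


strings1 = np.array(["a", "b", "c"], dtype=str)
strings2 = np.array(["d", "e", "f"], dtype=str)
strings3 = np.array(["g", "h", "i"], dtype=str)

print(join_elementwise([strings1, strings2, strings3]))
# ['a d g' 'b e h' 'c f i']
print(join_elementwise([strings1, strings2, strings3], sep="-"))
# ['a-d-g' 'b-e-h' 'c-f-i']
```

Note that `zip` stops at the shortest input without raising, so arrays of unequal length would be silently cut off. Letting `np.array` infer the dtype also avoids hard-coding a width such as '<U5'.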
0

Try this:

strings = [strings1, strings2, strings3]
# Iterate over element positions (the array length), not over the number of arrays
np.array([' '.join(x[i] for x in strings) for i in range(len(strings[0]))])
Marios
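
Since the question asks about generalizing np.char.add itself, one more possible sketch is to fold it over the list with `functools.reduce`. This is a different technique from the answers above; `add_with_sep` is a hypothetical helper name, and the sketch relies on np.char.add broadcasting a scalar separator against the array.

```python
from functools import reduce

import numpy as np

strings1 = np.array(["a", "b", "c"], dtype=str)
strings2 = np.array(["d", "e", "f"], dtype=str)
strings3 = np.array(["g", "h", "i"], dtype=str)


def add_with_sep(left, right, sep=" "):
    # Hypothetical helper: append the separator to `left`, then append `right`.
    return np.char.add(np.char.add(left, sep), right)


# reduce applies add_with_sep pairwise: ((strings1 + strings2) + strings3)
result = reduce(add_with_sep, [strings1, strings2, strings3])
print(result)  # ['a d g' 'b e h' 'c f i']
```

As mentioned in the comments on the question, the np.char functions are not expected to be faster than plain Python iteration, so this is about compactness rather than speed.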