45

Is this a bug?

import numpy as np
a1=np.array(['a','b'])
a2=np.array(['E','F'])

In [20]: add(a1,a2)
Out[20]: NotImplemented

I am trying to do element-wise string concatenation. I thought `add()` was the way to do it in NumPy, but it is clearly not working as expected.

Amro
Dave31415
  • 1
    As the name implies, number is for numbers. Python itself has pretty good string operations. Why not just use that? `"".join(["a", "b"])` works fine. – Keith Mar 31 '12 at 18:29
  • 1
    I was looking at this http://docs.scipy.org/doc/numpy/reference/routines.char.html – Dave31415 Mar 31 '12 at 18:39
  • 2
    That's cool. But: "All of them are based on the string methods in the Python standard library.". So if you just use the standard library you can write code that doesn't depend on numpy. – Keith Mar 31 '12 at 18:44
  • 1
    The `add` operation does not do the same thing as `join`. numpy's add can be useful for multidimensional arrays or nested lists. – gypaetus Dec 03 '15 at 17:50

7 Answers

78

This can be done using numpy.char.add. Here is an example:

>>> import numpy as np
>>> a1 = np.array(['a', 'b'])
>>> a2 = np.array(['E', 'F'])
>>> np.char.add(a1, a2)
array(['aE', 'bF'], 
      dtype='<U2')

(This was previously known as numpy.core.defchararray.add, and that name is still usable, but numpy.char.add is the preferred alias now.)

There are other useful string operations available for NumPy data types.
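As a minimal sketch of how `np.char.add` behaves beyond the two-array case above: it follows the usual NumPy broadcasting rules, so a scalar string or arrays of compatible shapes can be combined as well. The variable names `pairs`, `suffixed` and `grid`, and the `'_x'` suffix, are only illustrative.

```python
import numpy as np

a1 = np.array(['a', 'b'])
a2 = np.array(['E', 'F'])

# Element-wise concatenation of two equal-shaped string arrays.
pairs = np.char.add(a1, a2)

# A scalar string is broadcast against every element.
suffixed = np.char.add(a1, '_x')

# A column broadcast against a row yields every combination.
grid = np.char.add(a1[:, None], a2[None, :])

print(pairs)
print(suffixed)
print(grid)
```

The result dtype is sized to hold the longest concatenated string, so no truncation occurs.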

user2357112
Mike T
  • 7
    As noted in the docstring of the module, "the preferred alias for `defchararray` is `numpy.char`", so you can just say `np.char.add`. – jdehesa Jul 31 '17 at 14:54
14

You can use the chararray subclass to perform array operations with strings:

a1 = np.char.array(['a', 'b'])
a2 = np.char.array(['E', 'F'])

a1 + a2
#chararray(['aE', 'bF'], dtype='|S2')

Another nice example, multiplying by an integer array to repeat each string:

b = np.array([2, 4])
a1*b
#chararray(['aa', 'bbbb'], dtype='|S4')
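If you would rather keep plain `ndarray` objects than the `chararray` subclass, the same repetition can probably be written with the function `np.char.multiply`. This is a small sketch, not part of the answer above; `counts` and `repeated` are illustrative names.

```python
import numpy as np

a1 = np.array(['a', 'b'])
counts = np.array([2, 4])

# Repeat each string element-wise by the matching integer count.
repeated = np.char.multiply(a1, counts)

print(repeated)
```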
Saullo G. P. Castro
7

This can (and should) be done in pure Python, as numpy also uses the Python string manipulation functions internally:

>>> a1 = ['a','b']
>>> a2 = ['E','F']
>>> map(''.join, zip(a1, a2))
['aE', 'bF']
Saullo G. P. Castro
Niklas B.
4

Another solution is to convert the string arrays into arrays of Python objects so that `str.__add__` is called:

>>> import numpy as np
>>> a = np.array(['a', 'b', 'c', 'd'], dtype=object)
>>> a + a
array(['aa', 'bb', 'cc', 'dd'], dtype=object)

This is not that slow (less than twice as slow as adding integer arrays).
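Applied to the arrays from the question, the object-dtype approach might look like the sketch below. The final `astype(str)` step is an addition here, assuming you want a fixed-width string array back rather than an object array.

```python
import numpy as np

a1 = np.array(['a', 'b'], dtype=object)
a2 = np.array(['E', 'F'], dtype=object)

# With dtype=object, + dispatches to Python's str concatenation per element.
combined = a1 + a2

# Optional: convert back to a fixed-width Unicode array.
as_text = combined.astype(str)

print(combined)
print(as_text)
```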

jonathanrocher
2

One more basic, elegant and fast solution:

In [11]: np.array([x1 + x2 for x1,x2 in zip(a1,a2)])
Out[11]: array(['aE', 'bF'], dtype='<U2')

It is very fast for smaller arrays.

In [12]: %timeit np.array([x1 + x2 for x1,x2 in zip(a1,a2)])
3.67 µs ± 136 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [13]: %timeit np.core.defchararray.add(a1, a2)
6.27 µs ± 28.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [14]: %timeit np.char.array(a1) + np.char.array(a2)
22.1 µs ± 319 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

For larger arrays, the time difference is small.

In [15]: b1 = np.full(10000,'a')    
In [16]: b2 = np.full(10000,'b')    

In [189]: %timeit np.array([x1 + x2 for x1,x2 in zip(b1,b2)])
6.74 ms ± 66.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [188]: %timeit np.core.defchararray.add(b1, b2)
7.03 ms ± 419 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [187]: %timeit np.char.array(b1) + np.char.array(b2)
6.97 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
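To reproduce this kind of comparison outside IPython, a rough sketch with the standard-library `timeit` module could look like this. The helper names and the repetition count of 20 are arbitrary choices, and the timings will vary with the machine and the NumPy version.

```python
import timeit
import numpy as np

b1 = np.full(10000, 'a')
b2 = np.full(10000, 'b')

def with_comprehension():
    # Concatenate in a Python loop, then build an array from the list.
    return np.array([x1 + x2 for x1, x2 in zip(b1, b2)])

def with_char_add():
    # Let NumPy do the element-wise concatenation.
    return np.char.add(b1, b2)

runs = 20
for fn in (with_comprehension, with_char_add):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e3:.2f} ms per call")
```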
Gaurav Singhal
1

Adding to Niklas B.'s answer: in Python 3, `map` returns a lazy map object rather than a list, so the snippet above no longer prints the list directly.

To get a list, wrap the call in `list()`:

>>> a1 = ['a','b']
>>> a2 = ['E','F']
>>> list(map(''.join, zip(a1, a2)))  # <--- See here we have added list()
['aE', 'bF']
DiegoV
0

To convert an array of integers such as [10, 20, 30] to a list of strings ["10k", "20k", "30k"], I did the following (the array here runs from 10 to 90):

import numpy as np
b =np.arange(10,100,10)
d=[]
for i in b:
  c=str(i)+"k"
  d.append(c)
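The same result can probably be obtained without an explicit loop by combining `astype(str)` with `np.char.add`, as discussed in the other answers. This is a sketch rather than part of the approach above; `labels` is an illustrative name.

```python
import numpy as np

b = np.arange(10, 100, 10)

# Convert the integers to strings, then append "k" to every element.
labels = np.char.add(b.astype(str), 'k')

print(labels)
```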
h0r53
  • 1
    Thank you for your interest in contributing to the Stack Overflow community. This question already has quite a few answers—including one that has been extensively validated by the community. Are you certain your approach hasn’t been given previously? **If so, it would be useful to explain how your approach is different, under what circumstances your approach might be preferred, and/or why you think the previous answers aren’t sufficient.** Can you kindly [edit] your answer to offer an explanation? – Jeremy Caney Aug 23 '23 at 02:35