replace an empty numpy string with 0

Question

I have a numpy string array which looks like this:

[['0', '', '12.12', '140.65', '', ''],
['3', '', '10.45', '154.45', '', ''],
['5', '', '15.65', '184.74', '', '']]

What I need to do is to replace the empty cells with a number in order to convert it into a float array. I can't just delete the columns because in some cases the empty cells are filled. I tried this:

data = np.char.replace(data, '','0').astype(np.float64)

But this just puts a 0 between all characters, which results in this:

[[0, 0, 1020.0102, 104000.0605, 0, 0],
[30, 0, 1000.0405, 105040.0405, 0, 0],
[50, 0, 1050.0605, 108040.0704, 0, 0]]

I can't figure out why Python does that. I searched via Google but couldn't find a good explanation of numpy.char.replace. Can anyone explain to me how it works?

Possible duplicate of [Numpy array, fill empty values for a single column](https://stackoverflow.com/questions/20512101/numpy-array-fill-empty-values-for-a-single-column) — Grzegorz Oledzki, Nov 16 '17 at 09:15
Your 'empty' cells contain commas. Char.replace applies the regular string replace method to each element. — hpaulj, Nov 16 '17 at 14:28

timgeb · Accepted Answer · 2017-11-16T09:16:18.347

4

>>> import numpy as np
>>> a = np.array([['0', '', '12.12', '140.65', '', ''],
...               ['3', '', '10.45', '154.45', '', ''],
...               ['5', '', '15.65', '184.74', '', '']])
>>> a[a == ''] = 0  # boolean mask matches whole cells; 0 is stored as the string '0'
>>> a.astype(np.float64)
array([[   0.  ,    0.  ,   12.12,  140.65,    0.  ,    0.  ],
       [   3.  ,    0.  ,   10.45,  154.45,    0.  ,    0.  ],
       [   5.  ,    0.  ,   15.65,  184.74,    0.  ,    0.  ]])

edited Nov 16 '17 at 09:16

answered Nov 16 '17 at 09:13

You might want to have empty strings in the initial array like OP – lxop Nov 16 '17 at 09:15
@ixop that's a copy-paste fail. Give me a second. – timgeb Nov 16 '17 at 09:15
Perfect, this works great! Thank you for your fast answer! – Toggo Nov 16 '17 at 09:21
@Toggo `replace` operates on substrings, so it more or less checks each position of each of your strings for the substring to be replaced. Since '' matches everywhere, you will have '0' inserted everywhere. – Paul Panzer Nov 16 '17 at 09:31
@Paul Panzer Ok, I guessed it would be this way. Is there a possibility to make replace operate with the whole string rather than a substring? – Toggo Nov 16 '17 at 09:34
@Toggo `replace` itself I don't think so. I think this answer is the way to do it. – Paul Panzer Nov 16 '17 at 10:06
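
To make the point from the comments concrete, here is a minimal sketch of the substring behaviour. It uses Python's own `str.replace`, which, per hpaulj's comment above, is what `np.char.replace` applies to each element. The array name `cells` is only an illustration and not part of the original question.

```python
import numpy as np

# '' matches before every character and once more at the end of the string,
# so the replacement is inserted at each of those positions.
print('12.12'.replace('', '0'))  # 01020.01020
print(''.replace('', '0'))       # 0

# Hypothetical three-cell sample taken from the first row of the question.
cells = np.array(['0', '', '12.12'])

# Element-wise str.replace reproduces the unwanted result from the question.
interleaved = [s.replace('', '0') for s in cells.tolist()]
print(interleaved)

# A boolean mask compares whole cells, so only truly empty cells are selected.
mask = cells == ''
print(mask)
```

The mask is what lets the accepted answer touch only the empty cells instead of every position inside every string.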

Fagin Vazio · Answer 2 · 2017-11-16T10:29:13.463

data = np.char.replace(data, '','0')

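np.char.replace inserts the replacement at every position where the empty string matches, that is, before each character and at the end of the string: '' has one such position, '0' has two, and '12.12' has six. The result is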

[['000' '0' '01020.01020' '0104000.06050' '0' '0']
 ['030' '0' '01000.04050' '0105040.04050' '0' '0']
 ['050' '0' '01050.06050' '0108040.07040' '0' '0']]

Try this :

import numpy as np

a = np.array([['0', '', '12.12', '140.65', '', ''],
              ['3', '', '10.45', '154.45', '', ''],
              ['5', '', '15.65', '184.74', '', '']])

#a[np.where(a == '')] = '0'
a[a == ''] = '0'

a = a.astype(np.float64)

print(a)
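
One caveat worth checking if the fill value is longer than '0': NumPy string arrays have a fixed width, and in-place assignment silently cuts a longer string to fit. The sketch below uses a hypothetical array `narrow` and a hypothetical fill value '-1', neither of which comes from the question, to show the effect and one way around it with `np.where`.

```python
import numpy as np

# Hypothetical data: every cell is at most one character wide,
# so NumPy chooses a one-character-wide string dtype.
narrow = np.array(['3', '', '5'])

# In-place assignment keeps that dtype, so '-1' is truncated to '-'.
truncated = narrow.copy()
truncated[truncated == ''] = '-1'
print(truncated)

# np.where builds a new array wide enough for both the fill value
# and the original cells, so nothing is cut off.
widened = np.where(narrow == '', '-1', narrow)
print(widened.astype(np.float64))
```

With the single-character fill '0' used in this answer the problem does not arise, because every string array is at least one character wide.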

score 0 · Answer 3 · answered Apr 29 '21 at 00:05

I know that this is an old question, but the accepted answer may not work as-is in every setup. In my case, the [a == ''] comparison produced a FutureWarning:

FutureWarning: elementwise comparison failed; returning scalar
instead, but in the future will perform elementwise comparison

One method that does the trick without a warning is numpy.where():

import numpy as np
a = np.array([['0', '', '12.12', '140.65', '', ''],
              ['3', '', '10.45', '154.45', '', ''],
              ['5', '', '15.65', '184.74', '', '']])

result = np.where(a == '', '0', a)
print(result)

The result is

[['0' '0' '12.12' '140.65' '0' '0']  
 ['3' '0' '10.45' '154.45' '0' '0']  
 ['5' '0' '15.65' '184.74' '0' '0']]
