
I'd like to take a string like this

A = '[[complex(1,-1), complex(1,1)], [1, 1]]'

and transform it into this numpy array

array([[1.-1.j, 1.+1.j],
       [1.+0.j, 1.+0.j]])

I have tried using this code

string = A.replace("[","").replace("]","").replace("\"",  "").replace(" ","")
np.fromstring(string, dtype=complex, count=-1, sep=',')

but the output is this

array([], dtype=complex128)

If there were at least an easy way to put the matrix A into this form

A = '1-1j, 1+1j, 1, 1'

the code above would work. I need to do this for many matrices imported from .csv files.

danuzco

1 Answer


Resist the temptation to use eval: it executes arbitrary code, so it is unsafe on any input you do not fully control, and it is bad practice in general. If you can be sure the input is safe because you control it, change the output of the previous step so that you don't need eval in the first place.

Since your use case is very specific (complex(a, b), where neither a nor b can contain commas or parentheses), it's easy to write a regex pattern that replaces your complex calls with complex literals. Once we have that, we can use ast.literal_eval to safely convert the string to numbers, which you can then feed to numpy.

See the following few definitions:

import ast
import re

import numpy as np

pattern = re.compile(r'complex\(([^)]+),([^)]+)\)')

def complex_from_match(match):
    """Converts a regex match for 'complex(a, b)' to a complex number"""
    # Named real/imag rather than re/im so the re module is not shadowed
    real = float(match.group(1))
    imag = float(match.group(2))
    return real + 1j*imag

def literalify_complex(s):
    """Replaces every 'complex(a, b)' call in s with the equivalent complex literal"""
    return pattern.sub(lambda match: repr(complex_from_match(match)), s)

Here pattern is a regex pattern that matches substrings of the form complex(a, b). For each match, match.group(1) gives us a and match.group(2) gives us b, both as strings.
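To make the two capture groups concrete, here is a small sketch that lists what the pattern captures for the string from the question. The pattern is repeated so the snippet runs on its own; the names s and groups are only for illustration.

```python
import re

# Same pattern as above, repeated so this snippet is self-contained.
pattern = re.compile(r'complex\(([^)]+),([^)]+)\)')

s = '[[complex(1,-1), complex(1,1)], [1, 1]]'

# Collect the raw text captured for a and b in every complex(a, b) call.
groups = [(m.group(1), m.group(2)) for m in pattern.finditer(s)]
print(groups)  # [('1', '-1'), ('1', '1')]
```

The groups are plain strings at this point, which is why complex_from_match passes them through float before building the number.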

The function literalify_complex(s) calls pattern.sub on the input string s, which calls the passed callable (a lambda) on every match of pattern in s. The lambda uses the complex_from_match function, which takes the match for complex(a,b) and turns it into the native python complex number a + 1j*b. This complex number is then converted into its repr during replacement, effectively turning things like complex(1, 3.4) into (1+3.4j). Here's what the output looks like:

>>> literalify_complex('[[complex(1,-1), complex(1,1)], [1, 1]]')
'[[(1-1j), (1+1j)], [1, 1]]'

>>> literalify_complex('[[complex(0,-3.14), complex(1.57337537832783243243,1e-100)], [1, 1]]')
'[[-3.14j, (1.5733753783278324+1e-100j)], [1, 1]]'

These are now strings that can directly be converted to nested python lists using literal_eval:

>>> ast.literal_eval(literalify_complex('[[complex(1,-1), complex(1,1)], [1, 1]]'))
[[(1-1j), (1+1j)], [1, 1]]

And then these lists can be passed to numpy:

def crazy_complex_string_to_ndarray(s): 
    return np.array(ast.literal_eval(literalify_complex(s)))

# >>> crazy_complex_string_to_ndarray('[[complex(1,-1), complex(1,1)], [1, 1]]')
# array([[1.-1.j, 1.+1.j],
#        [1.+0.j, 1.+0.j]])
#
# >>> crazy_complex_string_to_ndarray('[[complex(0,-3.14),  complex(1.57337537832783243243,1e-100)], [1, 1]]')
# array([[-0.        -3.14e+000j,  1.57337538+1.00e-100j],
#        [ 1.        +0.00e+000j,  1.        +0.00e+000j]])
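Since the question mentions many matrices coming from .csv files, here is a rough sketch of how the same conversion might be applied row by row. The file layout (a name column plus a quoted matrix column), the sample data, and the helper name parse_matrix are all assumptions made for illustration, and the snippet stops at nested lists so that it needs only the standard library; each result could then be passed to np.array as above.

```python
import ast
import csv
import io
import re

pattern = re.compile(r'complex\(([^)]+),([^)]+)\)')

def parse_matrix(s):
    """Hypothetical helper: same steps as above, returning nested lists."""
    literal = pattern.sub(
        lambda m: repr(float(m.group(1)) + 1j * float(m.group(2))), s)
    return ast.literal_eval(literal)

# Assumed CSV layout; the matrix field is quoted because it contains commas.
csv_text = (
    'name,matrix\n'
    'A,"[[complex(1,-1), complex(1,1)], [1, 1]]"\n'
    'B,"[[complex(0,2), 3], [4, complex(5,0)]]"\n'
)

# io.StringIO stands in for an open file here.
matrices = {
    row['name']: parse_matrix(row['matrix'])
    for row in csv.DictReader(io.StringIO(csv_text))
}
print(matrices['A'])  # [[(1-1j), (1+1j)], [1, 1]]
```

If the real files are laid out differently, only the reading loop needs to change; the string-to-list conversion stays the same.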

The nice thing about this approach is that malformed or malicious input fails loudly with an exception (from float if a or b is not a number, otherwise during the ast.literal_eval step), rather than giving unexpected or harmful results.
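A short sketch of that failure behaviour follows. The helper try_parse and the sample inputs are made up for this illustration; the definitions from above are repeated so the snippet runs on its own.

```python
import ast
import re

pattern = re.compile(r'complex\(([^)]+),([^)]+)\)')

def literalify_complex(s):
    return pattern.sub(
        lambda m: repr(float(m.group(1)) + 1j * float(m.group(2))), s)

def try_parse(s):
    """Return the parsed nested list, or the name of the exception raised."""
    try:
        return ast.literal_eval(literalify_complex(s))
    except (ValueError, SyntaxError) as exc:
        return type(exc).__name__

print(try_parse('[[complex(1,-1), 1]]'))       # [[(1-1j), 1]]
print(try_parse('__import__("os").getcwd()'))  # ValueError (a call is not a literal)
print(try_parse('[[complex(1,-1), 1]'))        # SyntaxError (unbalanced bracket)
```

Nothing in the second input is ever executed: ast.literal_eval only accepts literal syntax and rejects the function call outright.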