Convert String-array into just Array Python

Question

'[[[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n ...\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]]'

I have a string like the one above that contains an array. How can I get rid of the surrounding quotes (')? That is, I want to convert it into an actual array, not a string that looks like one.

I want an array looks like this:

[[[-2048, -2048, -2048, ..., -2048, -2048, -2048], [-2048, -2048, -2048, ..., -2048, -2048, -2048], ..., [-2048, -2048, -2048, ..., -2048, -2048, -2048]], [[-2048, -2048, -2048, ..., -2048, -2048, -2048], [-2048, -2048, -2048, ..., -2048, -2048, -2048], ..., [-2048, -2048, -2048, ..., -2048, -2048, -2048]]]

Your expected output would have commas in between each element — N Chauhan, Aug 12 '18 at 17:17
That second one is not a valid array either. There are no commas separating those elements. Is the separator supposed to be a space? What are the "..."? Omissions in the sample data, or actually part of the string? Is this supposed to be some form of established format, or something made up? Can you switch to using some other standardised format like JSON instead? — deceze, Aug 12 '18 at 17:17
Sorry, I have edited the question. The elements must be separated by commas. The " ... " is there because my array is too long to show all the numbers. — Key Jun, Aug 12 '18 at 17:20

score 1 · Accepted Answer · answered Aug 12 '18 at 17:50

This may be a bit overkill, but a safe way to parse this is to define a custom parser using, e.g., pyparsing:

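from pyparsing import Word, nestedExpr, nums

# A number token: starts with '-' or a digit, continues with digits, converted to int.
num_expr = Word('-' + nums, nums).setParseAction(lambda t: int(t[0]))
# Arbitrarily nested [...] groups whose items are number tokens.
array_expr = nestedExpr('[', ']', num_expr)

d = '[[[-2048 -2048]\r\n [-2048 -2048]]]'
print(array_expr.parseString(d).asList()[0])
# [[[-2048, -2048], [-2048, -2048]]]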
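
For comparison, here is a rough sketch of the same idea with only the standard library, assuming the string contains nothing but brackets, whitespace and integers. The names TOKEN_RE and parse_nested_ints are made up for this illustration and are not part of pyparsing or of the answer above. Unlike the pyparsing version, it rejects anything unexpected (such as a literal "...") instead of trying to interpret it.

```python
import re

# One token is '[', ']', a (possibly negative) integer, or any other
# non-space character (which is rejected below).
TOKEN_RE = re.compile(r"\[|\]|-?\d+|\S")


def parse_nested_ints(text):
    """Hypothetical helper: parse '[[1 2] [3 4]]'-style text into nested lists."""
    root = []        # sentinel list that receives the top-level value
    stack = [root]   # lists that are currently open, innermost last
    for token in TOKEN_RE.findall(text):
        if token == "[":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif token == "]":
            if len(stack) == 1:
                raise ValueError("unexpected ']'")
            stack.pop()
        else:
            try:
                stack[-1].append(int(token))
            except ValueError:
                raise ValueError(f"unexpected token {token!r}") from None
    if len(stack) != 1 or len(root) != 1:
        raise ValueError("unbalanced brackets or missing top-level list")
    return root[0]


d = '[[[-2048 -2048]\r\n [-2048 -2048]]]'
print(parse_nested_ints(d))
# [[[-2048, -2048], [-2048, -2048]]]
```

The explicit stack keeps the function iterative, so very deep nesting does not run into Python's recursion limit.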

deceze


Wow. You're using very advanced skills – Key Jun Aug 12 '18 at 18:43
Creating a parser for a domain specific language isn’t actually as complicated as you’d think; though yes, I’ve just stumbled into this pretty recently too. – deceze Aug 12 '18 at 18:53

score 0 · Answer 2 · 2018-08-12T17:34:03.270

Warning: eval() can be used to execute arbitrary Python code. You should never use eval() with untrusted strings. (See Security of Python's eval() on untrusted strings?)

variable = eval('your string here')

eval() evaluates a string as a Python expression and returns the result. (It only accepts expressions, so the assignment has to happen outside the call; passing 'variable=...' to eval raises a SyntaxError.) Be very careful with this practice, and avoid it wherever you can: if the string is not exactly what you expect, you risk serious security and stability problems. It is an interesting feature of Python, but I would suggest solving the problem a different way. If you provide more information, we may be able to help.

Also, if you can obtain that string in JSON format instead, parse it with Python's built-in json module; that would be much better practice.
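
As a small sketch of that JSON route, assuming you control the code that produces the string (the question does not say where it comes from): the values below are a hypothetical stand-in for the real data.

```python
import json

# Hypothetical stand-in for the real data. If the data were a NumPy array
# (an assumption), it would need .tolist() first, because json.dumps only
# handles plain Python containers.
data = [[[-2048, -2048], [-2048, -2048]]]

text = json.dumps(data)      # what the producer would store or send
restored = json.loads(text)  # what the consumer gets back

print(text)
# [[[-2048, -2048], [-2048, -2048]]]
print(restored == data)
# True
```

Because json.dumps writes every value, nothing is abbreviated with "..." and nothing has to be evaluated as code on the way back in.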

Edit: I have just noticed that your string is not valid Python as it stands, so evaluating it would fail: it lacks the commas that other users mentioned in the comments above. You would need to preprocess the string and then call eval, which is even more complicated and discouraged, though definitely possible.

Edit 2: One way to add the commas before evaluating is str.replace(" ", ", "), which puts a comma before each space. Note that this inserts one comma per space, so runs of several spaces, such as the two-space indentation after each line break in your string, must be collapsed to a single space first.
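
A possible variation on the comma idea, sketched under the assumption that whitespace only ever appears between values and between rows (as in the string from the question): splitting on whitespace and re-joining with ", " yields valid JSON, so json.loads can be used in place of eval. The short string below is a made-up sample in the same layout.

```python
import json

s = '[[[-2048 -2048]\r\n  [-2048 -2048]]]'

# str.split() with no argument splits on any run of whitespace, including
# "\r\n", so every gap between tokens becomes exactly one ", ".
normalized = ", ".join(s.split())
print(normalized)
# [[[-2048, -2048], [-2048, -2048]]]

array = json.loads(normalized)
print(array[0][1])
# [-2048, -2048]
```

If the string contains a literal "...", json.loads raises an error rather than guessing, which is arguably the right outcome because the omitted values are not in the string.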

edited Aug 12 '18 at 17:34

answered Aug 12 '18 at 17:20

That doesn't seem to be what I want! – Key Jun Aug 12 '18 at 17:26
What I want is what I described in the question. I have seen an example that uses `eval`; it is not related to my situation. – Key Jun Aug 12 '18 at 17:28
@KeyJun I think it is indeed related to your situation, as you describe it. Your problem is that your string is not valid Python as it stands, so you need to add the commas before calling eval, as I explained. The only other way is to write your own parser for Python-style arrays, which, by the way, is about as much trouble as adding the commas. – Aug 12 '18 at 17:29
So are you looking to make a function that tracks depth in the nested lists and splits the numbers on commas with `str.split(', ')`, producing a list of lists (of lists) of numbers? @KeyJun – N Chauhan Aug 12 '18 at 17:31
He does not have commas in the original string either, @NChauhan, so you would need to split on spaces first – Aug 12 '18 at 17:32
This is getting too complicated. My original string array is hard to convert, so I will have to find another way. Thank you all. – Key Jun Aug 12 '18 at 17:34

score 0 · Answer 3 · answered Aug 12 '18 at 17:33

Use re.sub to remove the unnecessary \r\n and add commas where necessary, then use ast.literal_eval to convert the cleaned-up string to a list:

>>> import ast
>>> import re
>>> s = '[[[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n ...\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]\r\n\r\n [[-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  ...\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]\r\n  [-2048 -2048 -2048 ... -2048 -2048 -2048]]]'
>>> s = s.replace(' ...', '')   # Only needed if the string contains '...' placeholders
>>> l = ast.literal_eval(re.sub(r'(\d?)(?:\r\n)*\s+', r'\1, ', s))
>>> print(l)
[[[-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048]], [[-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048]], [[-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048]], [[-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048]], [[-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048]], [[-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048], [-2048, -2048, -2048, -2048, -2048, -2048]]]

Certainly better to use ast.literal_eval if you handle exceptions correctly as pointed out in https://stackoverflow.com/questions/15197673/using-pythons-eval-vs-ast-literal-eval — , Aug 12 '18 at 17:36
@J.C.Rocamonde. `ast.literal_eval` is always safer and more secure than `eval`. Never, never use eval if you are not sure about the data source — Sunitha, Aug 12 '18 at 17:38
i got this error `ValueError: malformed node or string: <_ast.Ellipsis object at 0x7f08e19f2b00>` — Key Jun, Aug 12 '18 at 17:38
If your original data source has `...` (ellipsis), then you need that line `s = s.replace(' ...', '')` — Sunitha, Aug 12 '18 at 17:40
@Sunitha that is what I pointed out up there in my other answer :). However, evaluating strings dynamically does not seem to be a very good practice to me either. That means there is a bottleneck somewhere in the application data implementation-wise. — , Aug 12 '18 at 17:40
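
To make the ellipsis problem from the comments explicit, here is a minimal sketch that wraps the steps of this answer in a function and refuses truncated input up front. The name parse_array_string is hypothetical, and the whitespace substitution is a simplified form of the one in the answer (every run of whitespace becomes ", ").

```python
import ast
import re


def parse_array_string(s):
    """Hypothetical helper: convert a space-separated nested array string to lists."""
    if "..." in s:
        # The omitted values are not in the string, so removing the marker
        # would silently produce a smaller array than the original.
        raise ValueError("input contains '...': the array was abbreviated when printed")
    # Replace each run of whitespace (including "\r\n") with a single ", ".
    cleaned = re.sub(r"\s+", ", ", s.strip())
    return ast.literal_eval(cleaned)


print(parse_array_string('[[[-2048 -2048]\r\n  [-2048 -2048]]]'))
# [[[-2048, -2048], [-2048, -2048]]]
```

Checking for the marker explicitly avoids relying on how ast.literal_eval itself treats `...`. If the real string does contain it, the only complete fix is to regenerate the string without abbreviation at the source.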
