This question is similar to many previous questions on SO, but it seems distinct enough.
I have a data file with the following content, and the numbers need to be extracted. The number of elements in each number block is not fixed, and there is one empty line above and below each number block. The aim is to extract the numbers and, if possible, assign them to a Python numpy array.
string 1
234034 6361234 45096 12342134 2878814 456456
125294 7341234 17234 23135 768234 54134123
213203 6.25 2.36 1.0 0.0021
string 2
298034 20481234 45096 12502134 2870814 456456
19875294 441284 98234 27897135 251021524 768234 54134123
2.3261
string 3
744034 6644034 75096 5302134 298978814 456456
6767294 70441234 330234 200135 867234 54004123
204203 22015 120158 125 21 625 11 5 2.021
Expected output: the numbers from all blocks arranged as Bash arrays or numpy (Python) arrays. The numeric values shown below are only representative.
- Bash array : '744034','6644034','75096', .. .. '21','625','11','5','2.021'
or
Numpy array : [744034,6644034,75....,625,11,5,2.021]
My use case prefers a numpy array, though.
Taking a cue from a previous question, I tried sed -n '/^symmetry 1$/,/^symmetry 2$/p' file
but the output is empty, possibly because of whitespace around the start and end search terms.
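The whitespace suspicion can be checked in isolation. Below is a minimal sketch in Python, assuming the header lines carry stray leading or trailing spaces or tabs (this is a guess, not something confirmed from the real file). The `sample` text and the names `strict` and `tolerant` are made up for illustration. A pattern anchored tightly with `^` and `$` finds nothing, while one that allows spaces and tabs around the header finds both lines.

```python
import re

# Hypothetical sample: header lines padded with stray whitespace.
# This padding is an assumption about the real file, not a known fact.
sample = "  string 1 \n234034 6361234\n\nstring 2\t\n2.3261\n"

# Strict pattern: the header must fill the whole line exactly.
strict = re.compile(r"^string \d+$", re.MULTILINE)

# Tolerant pattern: optional spaces or tabs before and after the header.
tolerant = re.compile(r"^[ \t]*string \d+[ \t]*$", re.MULTILINE)

strict_hits = strict.findall(sample)
tolerant_hits = [hit.strip() for hit in tolerant.findall(sample)]

print(strict_hits)    # no header matches the strict pattern
print(tolerant_hits)  # both headers match the tolerant pattern
```

The header wording in the pattern also has to match the wording in the file exactly, so that is worth checking alongside the whitespace.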
I then tried Python, since I eventually need the numbers as a numpy array. Based on that question and help in the comments, I can get one block using the following code:
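import re
import sys

# Read the whole file named on the command line.
with open(sys.argv[1]) as f:
    text = f.read()

# Capture everything between the "string 1" and "string 2" header lines.
reg = re.compile(r'string 1(.*?)string 2', re.DOTALL)
for match in reg.finditer(text):
    print(match.groups())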
Output:
string 1
744034 6644034 75096 5302134 298978814 456456
6767294 70441234 330234 200135 867234 54004123
204203 22015 120158 125 21 625 11 5 2.021
string 2
I need suggestions on how to extract the numbers from every block.
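One direction that might generalize the single-block regex is to split the text on the header lines and convert whatever lies between them. This is a minimal sketch, not a definitive solution. It assumes every header looks like `string N` on its own line and that every other token in the file is numeric. The function name `extract_blocks` and the shortened `sample` text are hypothetical.

```python
import re


def extract_blocks(text):
    """Return {header: [floats]} for every 'string N' header in text."""
    # Assumed header format: 'string <digits>' alone on a line,
    # possibly padded with spaces or tabs.
    header = re.compile(r"^[ \t]*(string \d+)[ \t]*$", re.MULTILINE)
    # Splitting on a pattern with one capture group yields
    # [preamble, header1, body1, header2, body2, ...].
    parts = header.split(text)
    blocks = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        # str.split() with no argument splits on any whitespace, so
        # line breaks and the empty lines around a block do not matter.
        blocks[name] = [float(token) for token in body.split()]
    return blocks


# Shortened, made-up sample in the same layout as the data file.
sample = """string 1

234034 6361234 45096
213203 6.25 2.36 1.0 0.0021

string 2

298034 20481234
2.3261
"""

blocks = extract_blocks(sample)
for name, values in blocks.items():
    print(name, values)
```

Each list of floats could then be passed to `numpy.array(values)` to get one array per block. Any non-numeric token between two headers would raise a `ValueError` in `float`, which at least makes unexpected content visible.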