I have a log file containing many slices like this:
Align set A and merge into set B ...
setA, 4 images , image size 146 X 131
setA, image 1, shape center shift (7, -9) compared to image center
setA, image 2, shape center shift (8, -10) compared to image center
setA, image 3, shape center shift (6, -9) compared to image center
setA, image 4, shape center shift (6, -8) compared to image center
final set B, image size 143 X 129
Write set B ...
Now, I want to extract the numbers in this slice into a table:
| | width_A | height_A | shift_x | shift_y | width_B | height_B |
| --- | --- | --- | --- | --- | --- | --- |
| A1 | 146 | 131 | 7 | -9 | 143 | 129 |
| A2 | 146 | 131 | 8 | -10 | 143 | 129 |
| A3 | 146 | 131 | 6 | -9 | 143 | 129 |
| A4 | 146 | 131 | 6 | -8 | 143 | 129 |
I would split the procedure into two parts:
- text processing: read the text into a dictionary `data`, e.g., `data['A1']['shift_x'] = 7`;
- use pandas to convert the dictionary into a DataFrame: `df = pd.DataFrame(data)`.
But I am not familiar with Python text processing:
- Unlike the question "Python: How to loop through blocks of lines", my log text is not so well organised;
- regular expressions may be an option, but I can never remember the syntax for matching all the different kinds of symbols.
Does anyone have a good solution for this? Python is preferred. Thanks in advance.
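
To make the two-part idea concrete, here is a minimal sketch of what I have in mind. It is not a tested solution: the pattern names, the `parse_log` helper and the `LOG` sample string are made up for illustration, and it assumes every slice has exactly the line layout shown above and that set names differ between slices (otherwise row keys such as `A1` would overwrite each other).

```python
import re
import pandas as pd

# Sample slice copied from the log above (stands in for the real file).
LOG = """\
Align set A and merge into set B ...
setA, 4 images , image size 146 X 131
setA, image 1, shape center shift (7, -9) compared to image center
setA, image 2, shape center shift (8, -10) compared to image center
setA, image 3, shape center shift (6, -9) compared to image center
setA, image 4, shape center shift (6, -8) compared to image center
final set B, image size 143 X 129
Write set B ...
"""

# One pattern per kind of line; these assume the wording shown in the sample.
SIZE_A = re.compile(r"^set(\w+), \d+ images\s*, image size (\d+) X (\d+)")
SHIFT = re.compile(
    r"^set(\w+), image (\d+), shape center shift "
    r"\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)"
)
SIZE_B = re.compile(r"^final set \w+, image size (\d+) X (\d+)")

COLUMNS = ["width_A", "height_A", "shift_x", "shift_y", "width_B", "height_B"]


def parse_log(lines):
    """Collect one dict per image, keyed like 'A1', 'A2', ... (hypothetical helper)."""
    data = {}
    size_a = None   # (width, height) of the slice currently being read
    pending = []    # row keys still waiting for the final set B size
    for line in lines:
        line = line.strip()
        if m := SIZE_A.match(line):
            size_a = (int(m.group(2)), int(m.group(3)))
            pending = []
        elif (m := SHIFT.match(line)) and size_a is not None:
            key = f"{m.group(1)}{m.group(2)}"
            data[key] = {
                "width_A": size_a[0],
                "height_A": size_a[1],
                "shift_x": int(m.group(3)),
                "shift_y": int(m.group(4)),
            }
            pending.append(key)
        elif m := SIZE_B.match(line):
            # The final size applies to every image of the current slice.
            for key in pending:
                data[key]["width_B"] = int(m.group(1))
                data[key]["height_B"] = int(m.group(2))
            size_a, pending = None, []
    return data


data = parse_log(LOG.splitlines())

# orient="index" makes the outer keys (A1, A2, ...) the rows.
df = pd.DataFrame.from_dict(data, orient="index").reindex(columns=COLUMNS)
print(df)
```

For a real log, the same function should accept an open file object instead of `LOG.splitlines()`, since it only iterates over lines. Note that plain `pd.DataFrame(data)` would put `A1`, `A2`, ... in the columns rather than the rows, which is why the sketch uses `from_dict` with `orient="index"`.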