Python regex to find connected digits

Question

I have raw text files and need a regex that captures each number in a row, where the numbers are separated by spaces.

The data format is like this:

   6   3   1   0
   7   3   1   0
   8   35002   0
   9   34104   0

My regex is:

(?P<COORD>\d+)

For the first two lines the matches are (6, 3, 1, 0) and (7, 3, 1, 0), which is correct. For the last two lines, however, the output is (8, 35002, 0) and (9, 34104, 0), whereas the correct grouping is (8, 3, 5002, 0) and (9, 3, 4104, 0). How can I solve this?
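
The behaviour described above can be reproduced with two of the sample rows. This is only a minimal sketch; the variable names `rows` and `matches` are made up for illustration.

```python
import re

# Two sample rows from the data above: one that groups correctly
# and one where two adjacent columns run together.
rows = ["   6   3   1   0", "   8   35002   0"]

# \d+ takes every maximal run of digits, so it cannot see the
# column boundary inside "35002".
matches = [re.findall(r"\d+", row) for row in rows]
for row_matches in matches:
    print(row_matches)
```

The second row yields three matches instead of four because nothing in `\d+` encodes the column width.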

This is fixed-width text; see https://stackoverflow.com/questions/4914008/how-to-efficiently-parse-fixed-width-files – Wiktor Stribiżew, Nov 29 '21 at 15:50
[`(?P<COORD>(?<= {4})|(?<= {3})\d|(?<= {2})\d{2}|(?<= )\d{3}|\d{4})`](https://regex101.com/r/1sgwM4/1) – logi-kal, Nov 29 '21 at 16:37
@horcrux This code works. How can I give these four groups of digits different names? – Kelvin Lo, Nov 30 '21 at 12:12
`my_regex = "".join([r" *(?P<COORD%d>(?<= {4})|(?<= {3})\d|(?<= {2})\d{2}|(?<= )\d{3}|\d{4})" % i for i in range(1,5)])` gives you [this regex](https://regex101.com/r/AiWgZO/1) – logi-kal, Nov 30 '21 at 14:09
@horcrux Thank you! If you don't mind posting this as an answer, I would like to accept it as the best answer. – Kelvin Lo, Nov 30 '21 at 15:35
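
The lookbehind idea from the comments can be turned into a runnable sketch along the following lines. It assumes four right-aligned columns of width 4, as in the sample data. The group names `COORD1` to `COORD4` and the helper `parse_row` are illustrative assumptions, not names confirmed by the thread.

```python
import re

# One column: the number of spaces directly before the digits decides how
# many digits belong to the field (3 spaces -> 1 digit, ..., 0 -> 4 digits).
# The first alternative matches an empty field after 4 spaces.
# The group names COORD1..COORD4 are assumed for illustration.
COLUMN = r" *(?P<COORD%d>(?<= {4})|(?<= {3})\d|(?<= {2})\d{2}|(?<= )\d{3}|\d{4})"
ROW = re.compile("".join(COLUMN % i for i in range(1, 5)))


def parse_row(line):
    """Return the four fields of one row as a dict of strings, or None."""
    match = ROW.match(line)
    return match.groupdict() if match else None


rows = ["   6   3   1   0", "   8   35002   0"]
parsed = [parse_row(row) for row in rows]
for fields in parsed:
    print(fields)
```

Each alternative uses a lookbehind of fixed length, which is what Python's `re` module requires.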

score 0 · Answer 1 · answered Nov 29 '21 at 16:13

If the numbers are aligned and the column width is fixed, you can slice each line into equal-width chunks:

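width = 4  # every column is 4 characters wide
lines = ["   6   3   1   0", "   8   35002   0"]  # sample rows from the question
for line in lines:
    line = line.rstrip("\n")  # a trailing newline would otherwise yield a bogus last chunk
    # cut the line into consecutive width-sized slices
    columns = [line[j:j + width] for j in range(0, len(line), width)]
    numbers = [int(column) for column in columns]  # int() ignores surrounding spaces
    # or as a one-liner
    print([int(line[j:j + width]) for j in range(0, len(line), width)])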

answered Nov 29 '21 at 16:13

Shanavas M

Is it possible to use a regex instead? My string also contains other lines. – Kelvin Lo Nov 30 '21 at 12:13
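
One way to combine both needs is a regex that encodes the column width itself, so that it only matches data rows and skips the other lines. The following is a minimal sketch, assuming every data row consists of exactly four right-aligned columns of width 4. The names `FIELD`, `ROW`, `c1` to `c4` and the sample `text` are made up for illustration.

```python
import re

# One field: exactly 4 characters, right-aligned digits padded with spaces.
FIELD = r"(?: {3}\d| {2}\d{2}| \d{3}|\d{4})"

# A data row is four such fields and nothing else on the line.
ROW = re.compile(
    "^" + "".join(f"(?P<c{i}>{FIELD})" for i in range(1, 5)) + "$",
    re.MULTILINE,
)

# Hypothetical input mixing data rows with unrelated lines.
text = (
    "HEADER line\n"
    "   6   3   1   0\n"
    "some other text 123\n"
    "   8   35002   0\n"
)

# int() ignores the padding spaces captured inside each field.
rows = [tuple(int(field) for field in m.groups()) for m in ROW.finditer(text)]
print(rows)
```

Individual fields are also available by name, for example `m.group("c3")`, if different columns need different handling.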